Posted to commits@arrow.apache.org by ja...@apache.org on 2016/02/17 13:39:36 UTC

[01/17] arrow git commit: ARROW-3: This patch includes a WIP draft specification document for the physical Arrow memory layout produced over a series of discussions amongst the to-be Arrow committers during late 2015. There are also a few small PNG diagr

Repository: arrow
Updated Branches:
  refs/heads/master d5aa7c466 -> 23c4b08d1


ARROW-3: This patch includes a WIP draft specification document for the physical Arrow memory layout produced over a series of discussions amongst the to-be Arrow committers during late 2015. There are also a few small PNG diagrams that illustrate some of the Arrow layout concepts.


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/16e44e3d
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/16e44e3d
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/16e44e3d

Branch: refs/heads/master
Commit: 16e44e3d456219c48595142d0a6814c9c950d30c
Parents: fa5f029
Author: Wes McKinney <we...@cloudera.com>
Authored: Tue Feb 16 16:02:46 2016 -0800
Committer: Jacques Nadeau <ja...@apache.org>
Committed: Wed Feb 17 04:38:39 2016 -0800

----------------------------------------------------------------------
 format/Layout.md                           | 253 ++++++++++++++++++++++++
 format/README.md                           |   5 +
 format/diagrams/layout-dense-union.png     | Bin 0 -> 47999 bytes
 format/diagrams/layout-list-of-list.png    | Bin 0 -> 40105 bytes
 format/diagrams/layout-list-of-struct.png  | Bin 0 -> 60600 bytes
 format/diagrams/layout-list.png            | Bin 0 -> 15906 bytes
 format/diagrams/layout-primitive-array.png | Bin 0 -> 10907 bytes
 format/diagrams/layout-sparse-union.png    | Bin 0 -> 43020 bytes
 8 files changed, 258 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/Layout.md
----------------------------------------------------------------------
diff --git a/format/Layout.md b/format/Layout.md
new file mode 100644
index 0000000..c393163
--- /dev/null
+++ b/format/Layout.md
@@ -0,0 +1,253 @@
+# Arrow: Physical memory layout
+
+## Definitions / Terminology
+
+Since different projects have used different words to describe various
+concepts, here is a small glossary to help disambiguate.
+
+* Array: a sequence of values with known length all having the same type.
+* Slot or array slot: a single logical value in an array of some particular data type
+* Contiguous memory region: a sequential virtual address space with a given
+  length. Any byte can be reached via a single pointer offset less than the
+  region’s length.
+* Primitive type: a data type that occupies a fixed-size memory slot specified
+  in bit width or byte width
+* Nested or parametric type: a data type whose full structure depends on one or
+  more other child relative types. Two fully-specified nested types are equal
+  if and only if their child types are equal. For example, `List<U>` is distinct
+  from `List<V>` iff U and V are different relative types.
+* Relative type or simply type (unqualified): either a specific primitive type
+  or a fully-specified nested type. When we say slot we mean a relative type
+  value, not necessarily any physical storage region.
+* Logical type: A data type that is implemented using some relative (physical)
+  type. For example, a Decimal value stored in 16 bytes could be stored in a
+  primitive array with slot size 16 bytes. Similarly, strings can be stored as
+  `List<1-byte>`.
+* Parent and child arrays: names to express relationships between physical
+  value arrays in a nested type structure. For example, a `List<T>`-type parent
+  array has a T-type array as its child (see more on lists below).
+* Leaf node or leaf: A primitive value array that may or may not be a child
+  array of some array with a nested type.
+
+## Requirements, goals, and non-goals
+
+Base requirements
+
+* A physical memory layout enabling zero-deserialization data interchange
+  amongst a variety of systems handling flat and nested columnar data, including
+  such systems as Spark, Drill, Impala, Kudu, Ibis, ODBC protocols, and
+  proprietary systems that utilize the open source components.
+* All array slots are accessible in constant time, with complexity growing
+  linearly in the nesting level
+* Capable of representing fully-materialized and decoded / decompressed Parquet
+  data
+* All leaf nodes (primitive value arrays) use contiguous memory regions
+* Each relative type can be nullable or non-nullable
+* Arrays are immutable once created. Implementations can provide APIs to mutate
+  an array, but applying mutations will require a new array data structure to
+  be built.
+* Arrays are relocatable (e.g. for RPC/transient storage) without pointer
+  swizzling. Another way of putting this is that contiguous memory regions can
+  be migrated to a different address space (e.g. via a memcpy-type of
+  operation) without altering their contents.
+
+## Goals (for this document)
+
+* To describe relative types (physical value types and a preliminary set of
+  nested types) sufficient for an unambiguous implementation
+* Memory layout and random access patterns for each relative type
+* Null representation for nullable types
+
+## Non-goals (for this document)
+
+* To enumerate or specify logical types that can be implemented as primitive
+  (fixed-width) value types. For example: signed and unsigned integers,
+  floating point numbers, boolean, exact decimals, date and time types,
+  CHAR(K), VARCHAR(K), etc.
+* To specify standardized metadata or a data layout for RPC or transient file
+  storage.
+* To define a selection or masking vector construct
+* Implementation-specific details
+* Details of a user or developer C/C++/Java API.
+* Any “table” structure composed of named arrays each having their own type or
+  any other structure that composes arrays.
+* Any memory management or reference counting subsystem
+* To enumerate or specify types of encodings or compression support
+
+## Array lengths
+
+Any array has a known and fixed length, stored as a 32-bit signed integer, so a
+maximum of 2^31 - 1 elements. We choose a signed int32 for a couple of reasons:
+
+* To enhance compatibility with Java and client languages which may have varying quality of support for unsigned integers.
+* To encourage developers to compose smaller arrays (each of which contains
+  contiguous memory in its leaf nodes) to create larger array structures
+  possibly exceeding 2^31 - 1 elements, as opposed to allocating very large
+  contiguous memory blocks.
+
+## Nullable and non-nullable arrays
+
+Any relative type can be nullable or non-nullable.
+
+Nullable arrays have a contiguous memory buffer, known as the null bitmask,
+whose length is large enough to have 1 bit for each array slot. Whether any
+array slot is null is encoded in the respective bits of this bitmask, i.e.:
+
+```
+is_null[j] -> bitmask[j / 8] & (1 << (j % 8))
+```
+
+Physically, non-nullable (NN) arrays do not have a null bitmask.
+
+For nested types, if the top-level nested type is nullable, it has its own
+bitmask regardless of whether the child types are nullable.
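The bit addressing above can be sketched in a few lines of Python. This is only an illustration of the draft's formula, under which a set bit marks a null slot; the helper names `build_null_bitmask` and `is_null` are hypothetical and not part of any Arrow API.

```python
def build_null_bitmask(values):
    """Pack one bit per slot; bit j is set when slot j is null (None).

    Assumption: bits are numbered from the least-significant bit of each
    byte, as implied by the draft's `1 << (j % 8)` expression.
    """
    bitmask = bytearray((len(values) + 7) // 8)  # 1 bit per slot, rounded up
    for j, value in enumerate(values):
        if value is None:
            bitmask[j // 8] |= 1 << (j % 8)
    return bytes(bitmask)


def is_null(bitmask, j):
    """Apply the draft's rule: is_null[j] -> bitmask[j / 8] & (1 << (j % 8))."""
    return (bitmask[j // 8] & (1 << (j % 8))) != 0


# Nine slots need two bytes; slots 1, 3, 4 and 8 are null.
slots = [1, None, 3, None, None, 6, 7, 8, None]
mask = build_null_bitmask(slots)
```

A non-nullable array would simply skip building the bitmask, since the draft says such arrays have none.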
+
+## Primitive value arrays
+
+A primitive value array represents a fixed-length array of values each having
+the same physical slot width typically measured in bytes, though the spec also
+provides for bit-packed types (e.g. boolean values encoded in bits).
+
+Internally, the array contains a contiguous memory buffer whose total size is
+equal to the slot width multiplied by the array length. For bit-packed types,
+the size is rounded up to the nearest byte.
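As a rough illustration of the sizing rule, the following Python sketch computes the values buffer size from a slot width given in bits, and packs a small Int32 array. The function names are hypothetical, and the little-endian byte order is an assumption made for the example; this section of the draft does not state an endianness.

```python
import struct


def values_buffer_size(length, bit_width):
    """Bytes needed for `length` slots of `bit_width` bits, rounded up to a byte."""
    return (length * bit_width + 7) // 8


def pack_int32(values):
    """Lay out 4-byte signed integers contiguously (little-endian is assumed)."""
    return struct.pack("<%di" % len(values), *values)


buf = pack_int32([10, 20, 30])
int32_size = values_buffer_size(3, 32)   # 3 slots * 4 bytes
bool_size = values_buffer_size(10, 1)    # 10 bit-packed booleans
```

Slot j of a byte-width type starts at byte offset `j * slot_width`, which is what makes random access constant time.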
+
+The associated null bitmask (for nullable types) is contiguously allocated (as
+described above) but does not need to be adjacent in memory to the values
+buffer.
+
+(diagram not to scale)
+
+<img src="diagrams/layout-primitive-array.png" width="400"/>
+
+## List type
+
+List is a nested type in which each array slot contains a variable-size
+sequence of values all having the same relative type (heterogeneity can be
+achieved through unions, described later).
+
+A list type is specified like `List<T>`, where `T` is any relative type
+(primitive or nested).
+
+A list-array is represented by the combination of the following:
+
+* A values array, a child array of type T. T may also be a nested type.
+* An offsets array containing 32-bit signed integers with length equal to the
+  length of the top-level array plus one. Note that this limits the size of the
+  values array to 2^31 - 1.
+
+The offsets array encodes a start position in the values array, and the length
+of the value in each slot is computed using the first difference with the next
+element in the offsets array. For example, the position and length of slot j is
+computed as:
+
+```
+slot_position = offsets[j]
+slot_length = offsets[j + 1] - offsets[j]  // (for 0 <= j < length)
+```
+
+The first value in the offsets array is 0, and the last element is the length
+of the values array.
+
+Let’s consider an example, the type `List<Char>`, where Char is a 1-byte
+logical type.
+
+For an array of length 3 with respective values:
+
+[[‘j’, ‘o’, ‘e’], null, [‘m’, ‘a’, ‘r’, ‘k’]]
+
+We have the following offsets and values arrays
+
+<img src="diagrams/layout-list.png" width="400"/>
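For readers without the diagram, the same example can be worked through in Python. This is a minimal sketch, not the reference implementation: the function names are hypothetical, and encoding the null slot as a zero-length entry is an assumption that follows from the offset rules above (the null itself would be recorded in the null bitmask).

```python
def build_list_array(slots):
    """Flatten variable-length slots into an offsets array and a values array."""
    offsets = [0]          # the first offset is always 0
    values = []
    null_slots = []        # indices that the null bitmask would mark
    for j, slot in enumerate(slots):
        if slot is None:
            null_slots.append(j)   # assumed: a null slot contributes no values
        else:
            values.extend(slot)
        offsets.append(len(values))  # the last offset equals len(values)
    return offsets, values, null_slots


def get_slot(offsets, values, j):
    """Random access using slot_position and slot_length as defined above."""
    slot_position = offsets[j]
    slot_length = offsets[j + 1] - offsets[j]
    return values[slot_position:slot_position + slot_length]


offsets, values, null_slots = build_list_array(
    [["j", "o", "e"], None, ["m", "a", "r", "k"]]
)
```

Under these assumptions the offsets array has length 4 (array length plus one) and the values array holds the seven characters of both non-null slots back to back.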
+
+Let’s consider an array of a nested type, `List<List<byte>>`
+
+<img src="diagrams/layout-list-of-list.png" width="400"/>
+
+## Struct type
+
+A struct is a nested type parameterized by an ordered sequence of relative
+types (which can all be distinct), called its fields.
+
+Typically the fields have names, but the names and their types are part of the
+type metadata, not the physical memory layout.
+
+A struct does not have any additional allocated physical storage.
+
+Physically, a struct type has one child array for each field.
+
+For example, the struct (field names shown here as strings for illustration
+purposes)
+
+```
+Struct [nullable] <
+  name: String (= List<char>) [nullable],
+  age: Int32 [not-nullable]
+>
+```
+
+has two child arrays, one List<char> array (layout as above) and one
+non-nullable 4-byte physical value array having Int32 (not-null) logical
+type. Here is a diagram showing the full physical layout of this struct:
+
+<img src="diagrams/layout-list-of-struct.png" width="400"/>
+
+While a struct does not have physical storage for each of its semantic slots
+(i.e. each scalar C-like struct), an entire struct slot can be set to null via
+the bitmask. Whether each of the child field arrays can have null values
+depends on whether or not the respective relative type is nullable.
+
+## Dense union type
+
+A dense union is semantically similar to a struct, and contains an ordered
+sequence of relative types. While a struct contains multiple arrays, a union is
+semantically a single array in which each slot can have a different type.
+
+The union types may be named, but like structs this will be a matter of the
+metadata and will not affect the physical memory layout.
+
+We define two distinct union types that are optimized for different use
+cases. The first, the dense union, represents a mixed-type array with 6 bytes
+of overhead for each value. Its physical layout is as follows:
+
+* One child array for each relative type
+* Types array: An array of unsigned integers, enumerated from 0 corresponding
+  to each type, with the smallest byte width capable of representing the number
+  of types in the union.
+* Offsets array: An array of signed int32 values indicating the relative offset
+  into the respective child array for the type in a given slot. The respective
+  offsets for each child value array must be in order / increasing.
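The three components listed above can be sketched in Python as follows. This is an illustrative model using plain lists rather than packed buffers; the function names and the `type_ids` mapping are hypothetical and exist only for the example.

```python
def build_dense_union(slots, type_ids):
    """Split mixed-type slots into a types array, an offsets array and children.

    `type_ids` maps each Python type to its union type id, enumerated from 0.
    """
    children = [[] for _ in type_ids]   # one child array per relative type
    types = []
    offsets = []
    for value in slots:
        t = type_ids[type(value)]
        types.append(t)
        offsets.append(len(children[t]))  # offsets increase within each child
        children[t].append(value)
    return types, offsets, children


def dense_union_get(types, offsets, children, j):
    """Random access: pick the child by type id, then index by offset."""
    return children[types[j]][offsets[j]]


slots = [1.5, 7, 2.5, 8, 9]
types, offsets, children = build_dense_union(slots, {float: 0, int: 1})
```

Each child array holds only the values of its own type, so the total child storage equals the union length; the per-slot cost is the type id plus the 4-byte offset.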
+
+Alternate proposal (TBD): the types and offset values may be packed into an
+int48 with 2 bytes for the type and 4 bytes for the offset.
+
+Critically, the dense union allows for minimal overhead in the ubiquitous
+union-of-structs with non-overlapping-fields use case (Union<s1: Struct1, s2:
+Struct2, s3: Struct3, …>)
+
+Here is a diagram of an example dense union:
+
+<img src="diagrams/layout-dense-union.png" width="400"/>
+
+## Sparse union type
+
+A sparse union has the same structure as a dense union, with the omission of
+the offsets array. In this case, the child arrays are each equal in length to
+the length of the union. This is analogous to a large struct in which all
+fields are nullable.
+
+While a sparse union may use significantly more space compared with a dense
+union, it has some advantages that may be desirable in certain use cases:
+
+* More amenable to vectorized expression evaluation in some use cases.
+* Equal-length arrays can be interpreted as a union by only defining the types array
+
+<img src="diagrams/layout-sparse-union.png" width="400"/>
+
+Note that nested types in a sparse union must be internally consistent
+(e.g. see the List in the diagram), i.e. random access at any index j yields
+the correct value.
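A sparse union can be modeled the same way as a dense one, minus the offsets array. The sketch below is a simplified illustration with hypothetical names; using `None` as the filler for unused child slots is an assumption of the example, not something the draft prescribes.

```python
def build_sparse_union(slots, type_ids):
    """Build a types array plus one full-length child array per type.

    `type_ids` maps each Python type to its union type id, enumerated from 0.
    """
    n = len(slots)
    children = [[None] * n for _ in type_ids]  # every child has the union's length
    types = []
    for j, value in enumerate(slots):
        t = type_ids[type(value)]
        types.append(t)
        children[t][j] = value  # the value sits at the same index j in its child
    return types, children


def sparse_union_get(types, children, j):
    """Random access needs no offset: index j is valid in every child."""
    return children[types[j]][j]


slots = [1.5, 7, 2.5]
types, children = build_sparse_union(slots, {float: 0, int: 1})
```

Because all children share the union's length, existing equal-length arrays can be viewed as a union by supplying only the types array, at the cost of the unused slots.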
+
+## References
+
+Drill docs https://drill.apache.org/docs/value-vectors/

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/README.md
----------------------------------------------------------------------
diff --git a/format/README.md b/format/README.md
new file mode 100644
index 0000000..1120e62
--- /dev/null
+++ b/format/README.md
@@ -0,0 +1,5 @@
+## Arrow specification documents
+
+> **Work-in-progress specification documents**. These are discussion documents
+> created by the Arrow developers during late 2015 and in no way represent a
+> finalized specification.

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-dense-union.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-dense-union.png b/format/diagrams/layout-dense-union.png
new file mode 100644
index 0000000..5f1f381
Binary files /dev/null and b/format/diagrams/layout-dense-union.png differ

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-list-of-list.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-list-of-list.png b/format/diagrams/layout-list-of-list.png
new file mode 100644
index 0000000..5bc0078
Binary files /dev/null and b/format/diagrams/layout-list-of-list.png differ

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-list-of-struct.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-list-of-struct.png b/format/diagrams/layout-list-of-struct.png
new file mode 100644
index 0000000..00d6c6f
Binary files /dev/null and b/format/diagrams/layout-list-of-struct.png differ

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-list.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-list.png b/format/diagrams/layout-list.png
new file mode 100644
index 0000000..167b10b
Binary files /dev/null and b/format/diagrams/layout-list.png differ

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-primitive-array.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-primitive-array.png b/format/diagrams/layout-primitive-array.png
new file mode 100644
index 0000000..bd212f0
Binary files /dev/null and b/format/diagrams/layout-primitive-array.png differ

http://git-wip-us.apache.org/repos/asf/arrow/blob/16e44e3d/format/diagrams/layout-sparse-union.png
----------------------------------------------------------------------
diff --git a/format/diagrams/layout-sparse-union.png b/format/diagrams/layout-sparse-union.png
new file mode 100644
index 0000000..450ea29
Binary files /dev/null and b/format/diagrams/layout-sparse-union.png differ


[11/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
ARROW-1: Initial Arrow Code Commit


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/fa5f0299
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/fa5f0299
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/fa5f0299

Branch: refs/heads/master
Commit: fa5f0299f046c46e1b2f671e5e3b4f1956522711
Parents: cbc56bf
Author: Steven Phillips <st...@dremio.com>
Authored: Wed Feb 17 04:37:53 2016 -0800
Committer: Jacques Nadeau <ja...@apache.org>
Committed: Wed Feb 17 04:38:39 2016 -0800

----------------------------------------------------------------------
 java/.gitignore                                 |  22 +
 java/memory/pom.xml                             |  50 ++
 .../src/main/java/io/netty/buffer/ArrowBuf.java | 863 +++++++++++++++++++
 .../java/io/netty/buffer/ExpandableByteBuf.java |  55 ++
 .../main/java/io/netty/buffer/LargeBuffer.java  |  59 ++
 .../io/netty/buffer/MutableWrappedByteBuf.java  | 336 ++++++++
 .../netty/buffer/PooledByteBufAllocatorL.java   | 272 ++++++
 .../netty/buffer/UnsafeDirectLittleEndian.java  | 270 ++++++
 .../org/apache/arrow/memory/Accountant.java     | 272 ++++++
 .../apache/arrow/memory/AllocationManager.java  | 433 ++++++++++
 .../arrow/memory/AllocationReservation.java     |  86 ++
 .../arrow/memory/AllocatorClosedException.java  |  31 +
 .../org/apache/arrow/memory/BaseAllocator.java  | 781 +++++++++++++++++
 .../org/apache/arrow/memory/BoundsChecking.java |  35 +
 .../apache/arrow/memory/BufferAllocator.java    | 151 ++++
 .../org/apache/arrow/memory/BufferManager.java  |  66 ++
 .../org/apache/arrow/memory/ChildAllocator.java |  53 ++
 .../arrow/memory/DrillByteBufAllocator.java     | 141 +++
 .../arrow/memory/OutOfMemoryException.java      |  50 ++
 .../main/java/org/apache/arrow/memory/README.md | 121 +++
 .../org/apache/arrow/memory/RootAllocator.java  |  39 +
 .../org/apache/arrow/memory/package-info.java   |  24 +
 .../apache/arrow/memory/util/AssertionUtil.java |  37 +
 .../arrow/memory/util/AutoCloseableLock.java    |  43 +
 .../apache/arrow/memory/util/HistoricalLog.java | 185 ++++
 .../org/apache/arrow/memory/util/Metrics.java   |  40 +
 .../org/apache/arrow/memory/util/Pointer.java   |  28 +
 .../apache/arrow/memory/util/StackTrace.java    |  70 ++
 .../memory/src/main/resources/drill-module.conf |  25 +
 .../org/apache/arrow/memory/TestAccountant.java | 164 ++++
 .../apache/arrow/memory/TestBaseAllocator.java  | 648 ++++++++++++++
 .../org/apache/arrow/memory/TestEndianess.java  |  43 +
 java/pom.xml                                    | 470 ++++++++++
 java/vector/pom.xml                             | 165 ++++
 java/vector/src/main/codegen/config.fmpp        |  24 +
 .../src/main/codegen/data/ValueVectorTypes.tdd  | 168 ++++
 .../src/main/codegen/includes/license.ftl       |  18 +
 .../src/main/codegen/includes/vv_imports.ftl    |  62 ++
 .../codegen/templates/AbstractFieldReader.java  | 124 +++
 .../codegen/templates/AbstractFieldWriter.java  | 147 ++++
 .../AbstractPromotableFieldWriter.java          | 142 +++
 .../src/main/codegen/templates/BaseReader.java  |  73 ++
 .../src/main/codegen/templates/BaseWriter.java  | 117 +++
 .../main/codegen/templates/BasicTypeHelper.java | 538 ++++++++++++
 .../main/codegen/templates/ComplexCopier.java   | 133 +++
 .../main/codegen/templates/ComplexReaders.java  | 183 ++++
 .../main/codegen/templates/ComplexWriters.java  | 151 ++++
 .../codegen/templates/FixedValueVectors.java    | 813 +++++++++++++++++
 .../codegen/templates/HolderReaderImpl.java     | 290 +++++++
 .../src/main/codegen/templates/ListWriters.java | 234 +++++
 .../src/main/codegen/templates/MapWriters.java  | 240 ++++++
 .../src/main/codegen/templates/NullReader.java  | 138 +++
 .../codegen/templates/NullableValueVectors.java | 630 ++++++++++++++
 .../codegen/templates/RepeatedValueVectors.java | 421 +++++++++
 .../main/codegen/templates/UnionListWriter.java | 185 ++++
 .../src/main/codegen/templates/UnionReader.java | 194 +++++
 .../src/main/codegen/templates/UnionVector.java | 467 ++++++++++
 .../src/main/codegen/templates/UnionWriter.java | 228 +++++
 .../main/codegen/templates/ValueHolders.java    | 116 +++
 .../templates/VariableLengthVectors.java        | 644 ++++++++++++++
 .../org/apache/arrow/vector/AddOrGetResult.java |  38 +
 .../apache/arrow/vector/AllocationHelper.java   |  61 ++
 .../arrow/vector/BaseDataValueVector.java       |  91 ++
 .../apache/arrow/vector/BaseValueVector.java    | 125 +++
 .../java/org/apache/arrow/vector/BitVector.java | 450 ++++++++++
 .../apache/arrow/vector/FixedWidthVector.java   |  35 +
 .../org/apache/arrow/vector/NullableVector.java |  23 +
 .../vector/NullableVectorDefinitionSetter.java  |  23 +
 .../org/apache/arrow/vector/ObjectVector.java   | 220 +++++
 .../arrow/vector/SchemaChangeCallBack.java      |  52 ++
 .../apache/arrow/vector/ValueHolderHelper.java  | 203 +++++
 .../org/apache/arrow/vector/ValueVector.java    | 222 +++++
 .../arrow/vector/VariableWidthVector.java       |  51 ++
 .../apache/arrow/vector/VectorDescriptor.java   |  83 ++
 .../org/apache/arrow/vector/VectorTrimmer.java  |  33 +
 .../org/apache/arrow/vector/ZeroVector.java     | 181 ++++
 .../vector/complex/AbstractContainerVector.java | 143 +++
 .../arrow/vector/complex/AbstractMapVector.java | 278 ++++++
 .../vector/complex/BaseRepeatedValueVector.java | 260 ++++++
 .../vector/complex/ContainerVectorLike.java     |  43 +
 .../vector/complex/EmptyValuePopulator.java     |  54 ++
 .../apache/arrow/vector/complex/ListVector.java | 321 +++++++
 .../apache/arrow/vector/complex/MapVector.java  | 374 ++++++++
 .../arrow/vector/complex/Positionable.java      |  22 +
 .../complex/RepeatedFixedWidthVectorLike.java   |  40 +
 .../vector/complex/RepeatedListVector.java      | 428 +++++++++
 .../arrow/vector/complex/RepeatedMapVector.java | 584 +++++++++++++
 .../vector/complex/RepeatedValueVector.java     |  85 ++
 .../RepeatedVariableWidthVectorLike.java        |  35 +
 .../apache/arrow/vector/complex/StateTool.java  |  34 +
 .../arrow/vector/complex/VectorWithOrdinal.java |  30 +
 .../vector/complex/impl/AbstractBaseReader.java | 100 +++
 .../vector/complex/impl/AbstractBaseWriter.java |  59 ++
 .../vector/complex/impl/ComplexWriterImpl.java  | 193 +++++
 .../complex/impl/MapOrListWriterImpl.java       | 112 +++
 .../vector/complex/impl/PromotableWriter.java   | 196 +++++
 .../complex/impl/RepeatedListReaderImpl.java    | 145 ++++
 .../complex/impl/RepeatedMapReaderImpl.java     | 192 +++++
 .../impl/SingleLikeRepeatedMapReaderImpl.java   |  89 ++
 .../complex/impl/SingleListReaderImpl.java      |  88 ++
 .../complex/impl/SingleMapReaderImpl.java       | 108 +++
 .../vector/complex/impl/UnionListReader.java    |  98 +++
 .../vector/complex/reader/FieldReader.java      |  29 +
 .../vector/complex/writer/FieldWriter.java      |  27 +
 .../arrow/vector/holders/ComplexHolder.java     |  25 +
 .../arrow/vector/holders/ObjectHolder.java      |  38 +
 .../vector/holders/RepeatedListHolder.java      |  23 +
 .../arrow/vector/holders/RepeatedMapHolder.java |  23 +
 .../arrow/vector/holders/UnionHolder.java       |  37 +
 .../arrow/vector/holders/ValueHolder.java       |  31 +
 .../arrow/vector/types/MaterializedField.java   | 217 +++++
 .../org/apache/arrow/vector/types/Types.java    | 132 +++
 .../arrow/vector/util/ByteFunctionHelpers.java  | 233 +++++
 .../org/apache/arrow/vector/util/CallBack.java  |  23 +
 .../arrow/vector/util/CoreDecimalUtility.java   |  91 ++
 .../apache/arrow/vector/util/DateUtility.java   | 682 +++++++++++++++
 .../arrow/vector/util/DecimalUtility.java       | 737 ++++++++++++++++
 .../arrow/vector/util/JsonStringArrayList.java  |  57 ++
 .../arrow/vector/util/JsonStringHashMap.java    |  76 ++
 .../arrow/vector/util/MapWithOrdinal.java       | 248 ++++++
 .../util/OversizedAllocationException.java      |  49 ++
 .../util/SchemaChangeRuntimeException.java      |  41 +
 .../java/org/apache/arrow/vector/util/Text.java | 621 +++++++++++++
 .../apache/arrow/vector/util/TransferPair.java  |  27 +
 124 files changed, 22077 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/.gitignore
----------------------------------------------------------------------
diff --git a/java/.gitignore b/java/.gitignore
new file mode 100644
index 0000000..73c1be4
--- /dev/null
+++ b/java/.gitignore
@@ -0,0 +1,22 @@
+.project
+.buildpath
+.classpath
+.checkstyle
+.settings/
+.idea/
+TAGS
+*.log
+*.lck
+*.iml
+target/
+*.DS_Store
+*.patch
+*~
+git.properties
+contrib/native/client/build/
+contrib/native/client/build/*
+CMakeCache.txt
+CMakeFiles
+Makefile
+cmake_install.cmake
+install_manifest.txt

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/pom.xml
----------------------------------------------------------------------
diff --git a/java/memory/pom.xml b/java/memory/pom.xml
new file mode 100644
index 0000000..44332f5
--- /dev/null
+++ b/java/memory/pom.xml
@@ -0,0 +1,50 @@
+<?xml version="1.0"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor 
+  license agreements. See the NOTICE file distributed with this work for additional 
+  information regarding copyright ownership. The ASF licenses this file to 
+  You under the Apache License, Version 2.0 (the "License"); you may not use 
+  this file except in compliance with the License. You may obtain a copy of 
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required 
+  by applicable law or agreed to in writing, software distributed under the 
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS 
+  OF ANY KIND, either express or implied. See the License for the specific 
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.arrow</groupId>
+    <artifactId>arrow-java-root</artifactId>
+    <version>0.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>arrow-memory</artifactId>
+  <name>arrow-memory</name>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>com.codahale.metrics</groupId>
+      <artifactId>metrics-core</artifactId>
+      <version>3.0.1</version>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.code.findbugs</groupId>
+      <artifactId>jsr305</artifactId>
+      <version>3.0.1</version>
+    </dependency>
+
+    <dependency>
+      <groupId>com.carrotsearch</groupId>
+      <artifactId>hppc</artifactId>
+      <version>0.7.1</version>
+    </dependency>
+  </dependencies>
+
+
+  <build>
+  </build>
+
+
+
+</project>

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/ArrowBuf.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/ArrowBuf.java b/java/memory/src/main/java/io/netty/buffer/ArrowBuf.java
new file mode 100644
index 0000000..f033ba6
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/ArrowBuf.java
@@ -0,0 +1,863 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package io.netty.buffer;
+
+import io.netty.util.internal.PlatformDependent;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.channels.GatheringByteChannel;
+import java.nio.channels.ScatteringByteChannel;
+import java.nio.charset.Charset;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.arrow.memory.BaseAllocator;
+import org.apache.arrow.memory.BoundsChecking;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.BufferManager;
+import org.apache.arrow.memory.AllocationManager.BufferLedger;
+import org.apache.arrow.memory.BaseAllocator.Verbosity;
+import org.apache.arrow.memory.util.HistoricalLog;
+
+import com.google.common.base.Preconditions;
+
+public final class ArrowBuf extends AbstractByteBuf implements AutoCloseable {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ArrowBuf.class);
+
+  private static final AtomicLong idGenerator = new AtomicLong(0);
+
+  private final long id = idGenerator.incrementAndGet();
+  private final AtomicInteger refCnt;
+  private final UnsafeDirectLittleEndian udle;
+  private final long addr;
+  private final int offset;
+  private final BufferLedger ledger;
+  private final BufferManager bufManager;
+  private final ByteBufAllocator alloc;
+  private final boolean isEmpty;
+  private volatile int length;
+  private final HistoricalLog historicalLog = BaseAllocator.DEBUG ?
+      new HistoricalLog(BaseAllocator.DEBUG_LOG_LENGTH, "DrillBuf[%d]", id) : null;
+
+  public ArrowBuf(
+      final AtomicInteger refCnt,
+      final BufferLedger ledger,
+      final UnsafeDirectLittleEndian byteBuf,
+      final BufferManager manager,
+      final ByteBufAllocator alloc,
+      final int offset,
+      final int length,
+      boolean isEmpty) {
+    super(byteBuf.maxCapacity());
+    this.refCnt = refCnt;
+    this.udle = byteBuf;
+    this.isEmpty = isEmpty;
+    this.bufManager = manager;
+    this.alloc = alloc;
+    this.addr = byteBuf.memoryAddress() + offset;
+    this.ledger = ledger;
+    this.length = length;
+    this.offset = offset;
+
+    if (BaseAllocator.DEBUG) {
+      historicalLog.recordEvent("create()");
+    }
+
+  }
+
+  public ArrowBuf reallocIfNeeded(final int size) {
+    Preconditions.checkArgument(size >= 0, "reallocation size must be non-negative");
+
+    if (this.capacity() >= size) {
+      return this;
+    }
+
+    if (bufManager != null) {
+      return bufManager.replace(this, size);
+    } else {
+      throw new UnsupportedOperationException("Realloc is only available in the context of an operator's UDFs");
+    }
+  }
+
+  @Override
+  public int refCnt() {
+    if (isEmpty) {
+      return 1;
+    } else {
+      return refCnt.get();
+    }
+  }
+
+  private long addr(int index) {
+    return addr + index;
+  }
+
+  private final void checkIndexD(int index, int fieldLength) {
+    ensureAccessible();
+    if (fieldLength < 0) {
+      throw new IllegalArgumentException("length: " + fieldLength + " (expected: >= 0)");
+    }
+    if (index < 0 || index > capacity() - fieldLength) {
+      if (BaseAllocator.DEBUG) {
+        historicalLog.logHistory(logger);
+      }
+      throw new IndexOutOfBoundsException(String.format(
+          "index: %d, length: %d (expected: range(0, %d))", index, fieldLength, capacity()));
+    }
+  }
+
+  /**
+   * Allows a function to determine whether or not reading a particular string of bytes is valid.
+   *
+   * Will throw an exception if the memory is not readable for some reason. Only does something in the case that
+   * BoundsChecking.BOUNDS_CHECKING_ENABLED is true.
+   *
+   * @param start
+   *          The starting position of the bytes to be read.
+   * @param end
+   *          The exclusive endpoint of the bytes to be read.
+   */
+  public void checkBytes(int start, int end) {
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      checkIndexD(start, end - start);
+    }
+  }
+
+  private void chk(int index, int width) {
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      checkIndexD(index, width);
+    }
+  }
+
+  private void ensure(int width) {
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      ensureWritable(width);
+    }
+  }
+
+  /**
+   * Create a new DrillBuf that is associated with an alternative allocator for the purposes of memory ownership and
+   * accounting. This has no impact on the reference counting for the current DrillBuf except in the situation where the
+   * passed in Allocator is the same as the current buffer.
+   *
+   * This operation has no impact on the reference count of this DrillBuf. The newly created DrillBuf will either have a
+   * reference count of 1 (in the case that this is the first time this memory is being associated with the new
+   * allocator) or the current value of the reference count + 1 for the other AllocationManager/BufferLedger combination
+   * in the case that the provided allocator already had an association to this underlying memory.
+   *
+   * @param target
+   *          The target allocator to create an association with.
+   * @return A new DrillBuf which shares the same underlying memory as this DrillBuf.
+   */
+  public ArrowBuf retain(BufferAllocator target) {
+
+    if (isEmpty) {
+      return this;
+    }
+
+    if (BaseAllocator.DEBUG) {
+      historicalLog.recordEvent("retain(%s)", target.getName());
+    }
+    final BufferLedger otherLedger = this.ledger.getLedgerForAllocator(target);
+    return otherLedger.newDrillBuf(offset, length, null);
+  }
+
+  /**
+   * Transfer the memory accounting ownership of this DrillBuf to another allocator. This will generate a new DrillBuf
+   * that carries an association with the underlying memory of this DrillBuf. If this DrillBuf is connected to the
+   * owning BufferLedger of this memory, that memory ownership/accounting will be transferred to the target allocator. If
+   * this DrillBuf does not currently own the memory underlying it (and is only associated with it), this does not
+   * transfer any ownership to the newly created DrillBuf.
+   *
+   * This operation has no impact on the reference count of this DrillBuf. The newly created DrillBuf will either have a
+   * reference count of 1 (in the case that this is the first time this memory is being associated with the new
+   * allocator) or the current value of the reference count for the other AllocationManager/BufferLedger combination in
+   * the case that the provided allocator already had an association to this underlying memory.
+   *
+   * Transfers will always succeed, even if that puts the other allocator into an overlimit situation. This is possible
+   * due to the fact that the original owning allocator may have allocated this memory out of a local reservation
+   * whereas the target allocator may need to allocate new memory from a parent or RootAllocator. This operation is done
+   * in a mostly-lockless but consistent manner. As such, the overlimit==true situation could occur slightly prematurely
+   * to an actual overlimit==true condition. This is simply conservative behavior which means we may return overlimit
+   * slightly sooner than is necessary.
+   *
+   * @param target
+   *          The allocator to transfer ownership to.
+   * @return A new transfer result with the impact of the transfer (whether it was overlimit) as well as the newly
+   *         created DrillBuf.
+   */
+  public TransferResult transferOwnership(BufferAllocator target) {
+
+    if (isEmpty) {
+      return new TransferResult(true, this);
+    }
+
+    final BufferLedger otherLedger = this.ledger.getLedgerForAllocator(target);
+    final ArrowBuf newBuf = otherLedger.newDrillBuf(offset, length, null);
+    final boolean allocationFit = this.ledger.transferBalance(otherLedger);
+    return new TransferResult(allocationFit, newBuf);
+  }
+
+  /**
+   * The outcome of a Transfer.
+   */
+  public class TransferResult {
+
+    /**
+     * Whether this transfer fit within the target allocator's capacity.
+     */
+    public final boolean allocationFit;
+
+    /**
+     * The newly created buffer associated with the target allocator.
+     */
+    public final ArrowBuf buffer;
+
+    private TransferResult(boolean allocationFit, ArrowBuf buffer) {
+      this.allocationFit = allocationFit;
+      this.buffer = buffer;
+    }
+
+  }
+
+  @Override
+  public boolean release() {
+    return release(1);
+  }
+
+  /**
+   * Release the provided number of reference counts.
+   */
+  @Override
+  public boolean release(int decrement) {
+
+    if (isEmpty) {
+      return false;
+    }
+
+    if (decrement < 1) {
+      throw new IllegalStateException(String.format("release(%d) argument is not positive. Buffer Info: %s",
+          decrement, toVerboseString()));
+    }
+
+    final int refCnt = ledger.decrement(decrement);
+
+    if (BaseAllocator.DEBUG) {
+      historicalLog.recordEvent("release(%d). original value: %d", decrement, refCnt + decrement);
+    }
+
+    if (refCnt < 0) {
+      throw new IllegalStateException(
+          String.format("DrillBuf[%d] refCnt has gone negative. Buffer Info: %s", id, toVerboseString()));
+    }
+
+    return refCnt == 0;
+
+  }
+
+  @Override
+  public int capacity() {
+    return length;
+  }
+
+  @Override
+  public synchronized ArrowBuf capacity(int newCapacity) {
+
+    if (newCapacity == length) {
+      return this;
+    }
+
+    Preconditions.checkArgument(newCapacity >= 0);
+
+    if (newCapacity < length) {
+      length = newCapacity;
+      return this;
+    }
+
+    throw new UnsupportedOperationException("Buffers don't support resizing that increases the size.");
+  }
+
+  @Override
+  public ByteBufAllocator alloc() {
+    return udle.alloc();
+  }
+
+  @Override
+  public ByteOrder order() {
+    return ByteOrder.LITTLE_ENDIAN;
+  }
+
+  @Override
+  public ByteBuf order(ByteOrder endianness) {
+    return this;
+  }
+
+  @Override
+  public ByteBuf unwrap() {
+    return udle;
+  }
+
+  @Override
+  public boolean isDirect() {
+    return true;
+  }
+
+  @Override
+  public ByteBuf readBytes(int length) {
+    throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public ByteBuf readSlice(int length) {
+    final ByteBuf slice = slice(readerIndex(), length);
+    readerIndex(readerIndex() + length);
+    return slice;
+  }
+
+  @Override
+  public ByteBuf copy() {
+    throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public ByteBuf copy(int index, int length) {
+    throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public ByteBuf slice() {
+    return slice(readerIndex(), readableBytes());
+  }
+
+  public static String bufferState(final ByteBuf buf) {
+    final int cap = buf.capacity();
+    final int mcap = buf.maxCapacity();
+    final int ri = buf.readerIndex();
+    final int rb = buf.readableBytes();
+    final int wi = buf.writerIndex();
+    final int wb = buf.writableBytes();
+    return String.format("cap/max: %d/%d, ri: %d, rb: %d, wi: %d, wb: %d",
+        cap, mcap, ri, rb, wi, wb);
+  }
+
+  @Override
+  public ArrowBuf slice(int index, int length) {
+
+    if (isEmpty) {
+      return this;
+    }
+
+    /*
+     * Re the behavior of reference counting, see http://netty.io/wiki/reference-counted-objects.html#wiki-h3-5, which
+     * explains that derived buffers share their reference count with their parent
+     */
+    final ArrowBuf newBuf = ledger.newDrillBuf(offset + index, length);
+    newBuf.writerIndex(length);
+    return newBuf;
+  }
+
+  @Override
+  public ArrowBuf duplicate() {
+    return slice(0, length);
+  }
+
+  @Override
+  public int nioBufferCount() {
+    return 1;
+  }
+
+  @Override
+  public ByteBuffer nioBuffer() {
+    return nioBuffer(readerIndex(), readableBytes());
+  }
+
+  @Override
+  public ByteBuffer nioBuffer(int index, int length) {
+    return udle.nioBuffer(offset + index, length);
+  }
+
+  @Override
+  public ByteBuffer internalNioBuffer(int index, int length) {
+    return udle.internalNioBuffer(offset + index, length);
+  }
+
+  @Override
+  public ByteBuffer[] nioBuffers() {
+    return new ByteBuffer[] { nioBuffer() };
+  }
+
+  @Override
+  public ByteBuffer[] nioBuffers(int index, int length) {
+    return new ByteBuffer[] { nioBuffer(index, length) };
+  }
+
+  @Override
+  public boolean hasArray() {
+    return udle.hasArray();
+  }
+
+  @Override
+  public byte[] array() {
+    return udle.array();
+  }
+
+  @Override
+  public int arrayOffset() {
+    return udle.arrayOffset();
+  }
+
+  @Override
+  public boolean hasMemoryAddress() {
+    return true;
+  }
+
+  @Override
+  public long memoryAddress() {
+    return this.addr;
+  }
+
+  @Override
+  public String toString() {
+    return String.format("DrillBuf[%d], udle: [%d %d..%d]", id, udle.id, offset, offset + capacity());
+  }
+
+  @Override
+  public String toString(Charset charset) {
+    return toString(readerIndex, readableBytes(), charset);
+  }
+
+  @Override
+  public String toString(int index, int length, Charset charset) {
+
+    if (length == 0) {
+      return "";
+    }
+
+    return ByteBufUtil.decodeString(nioBuffer(index, length), charset);
+  }
+
+  @Override
+  public int hashCode() {
+    return System.identityHashCode(this);
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    // identity equals only.
+    return this == obj;
+  }
+
+  @Override
+  public ByteBuf retain(int increment) {
+    Preconditions.checkArgument(increment > 0, "retain(%d) argument is not positive", increment);
+
+    if (isEmpty) {
+      return this;
+    }
+
+    if (BaseAllocator.DEBUG) {
+      historicalLog.recordEvent("retain(%d)", increment);
+    }
+
+    final int originalReferenceCount = refCnt.getAndAdd(increment);
+    Preconditions.checkArgument(originalReferenceCount > 0);
+    return this;
+  }
+
+  @Override
+  public ByteBuf retain() {
+    return retain(1);
+  }
+
+  @Override
+  public long getLong(int index) {
+    chk(index, 8);
+    final long v = PlatformDependent.getLong(addr(index));
+    return v;
+  }
+
+  @Override
+  public float getFloat(int index) {
+    return Float.intBitsToFloat(getInt(index));
+  }
+
+  @Override
+  public double getDouble(int index) {
+    return Double.longBitsToDouble(getLong(index));
+  }
+
+  @Override
+  public char getChar(int index) {
+    return (char) getShort(index);
+  }
+
+  @Override
+  public long getUnsignedInt(int index) {
+    return getInt(index) & 0xFFFFFFFFL;
+  }
+
+  @Override
+  public int getInt(int index) {
+    chk(index, 4);
+    final int v = PlatformDependent.getInt(addr(index));
+    return v;
+  }
+
+  @Override
+  public int getUnsignedShort(int index) {
+    return getShort(index) & 0xFFFF;
+  }
+
+  @Override
+  public short getShort(int index) {
+    chk(index, 2);
+    short v = PlatformDependent.getShort(addr(index));
+    return v;
+  }
+
+  @Override
+  public ByteBuf setShort(int index, int value) {
+    chk(index, 2);
+    PlatformDependent.putShort(addr(index), (short) value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setInt(int index, int value) {
+    chk(index, 4);
+    PlatformDependent.putInt(addr(index), value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setLong(int index, long value) {
+    chk(index, 8);
+    PlatformDependent.putLong(addr(index), value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setChar(int index, int value) {
+    chk(index, 2);
+    PlatformDependent.putShort(addr(index), (short) value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setFloat(int index, float value) {
+    chk(index, 4);
+    PlatformDependent.putInt(addr(index), Float.floatToRawIntBits(value));
+    return this;
+  }
+
+  @Override
+  public ByteBuf setDouble(int index, double value) {
+    chk(index, 8);
+    PlatformDependent.putLong(addr(index), Double.doubleToRawLongBits(value));
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeShort(int value) {
+    ensure(2);
+    PlatformDependent.putShort(addr(writerIndex), (short) value);
+    writerIndex += 2;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeInt(int value) {
+    ensure(4);
+    PlatformDependent.putInt(addr(writerIndex), value);
+    writerIndex += 4;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeLong(long value) {
+    ensure(8);
+    PlatformDependent.putLong(addr(writerIndex), value);
+    writerIndex += 8;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeChar(int value) {
+    ensure(2);
+    PlatformDependent.putShort(addr(writerIndex), (short) value);
+    writerIndex += 2;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeFloat(float value) {
+    ensure(4);
+    PlatformDependent.putInt(addr(writerIndex), Float.floatToRawIntBits(value));
+    writerIndex += 4;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeDouble(double value) {
+    ensure(8);
+    PlatformDependent.putLong(addr(writerIndex), Double.doubleToRawLongBits(value));
+    writerIndex += 8;
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, byte[] dst, int dstIndex, int length) {
+    udle.getBytes(index + offset, dst, dstIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, ByteBuffer dst) {
+    udle.getBytes(index + offset, dst);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setByte(int index, int value) {
+    chk(index, 1);
+    PlatformDependent.putByte(addr(index), (byte) value);
+    return this;
+  }
+
+  public void setByte(int index, byte b) {
+    chk(index, 1);
+    PlatformDependent.putByte(addr(index), b);
+  }
+
+  public void writeByteUnsafe(byte b) {
+    // Writes advance the writer index, not the reader index; no bounds check is performed.
+    PlatformDependent.putByte(addr(writerIndex), b);
+    writerIndex++;
+  }
+
+  @Override
+  protected byte _getByte(int index) {
+    return getByte(index);
+  }
+
+  @Override
+  protected short _getShort(int index) {
+    return getShort(index);
+  }
+
+  @Override
+  protected int _getInt(int index) {
+    return getInt(index);
+  }
+
+  @Override
+  protected long _getLong(int index) {
+    return getLong(index);
+  }
+
+  @Override
+  protected void _setByte(int index, int value) {
+    setByte(index, value);
+  }
+
+  @Override
+  protected void _setShort(int index, int value) {
+    setShort(index, value);
+  }
+
+  @Override
+  protected void _setMedium(int index, int value) {
+    setMedium(index, value);
+  }
+
+  @Override
+  protected void _setInt(int index, int value) {
+    setInt(index, value);
+  }
+
+  @Override
+  protected void _setLong(int index, long value) {
+    setLong(index, value);
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
+    udle.getBytes(index + offset, dst, dstIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, OutputStream out, int length) throws IOException {
+    udle.getBytes(index + offset, out, length);
+    return this;
+  }
+
+  @Override
+  protected int _getUnsignedMedium(int index) {
+    final long addr = addr(index);
+    return (PlatformDependent.getByte(addr) & 0xff) << 16 |
+        (PlatformDependent.getByte(addr + 1) & 0xff) << 8 |
+        PlatformDependent.getByte(addr + 2) & 0xff;
+  }
+
+  @Override
+  public int getBytes(int index, GatheringByteChannel out, int length) throws IOException {
+    return udle.getBytes(index + offset, out, length);
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, ByteBuf src, int srcIndex, int length) {
+    udle.setBytes(index + offset, src, srcIndex, length);
+    return this;
+  }
+
+  public ByteBuf setBytes(int index, ByteBuffer src, int srcIndex, int length) {
+    if (src.isDirect()) {
+      checkIndex(index, length);
+      PlatformDependent.copyMemory(PlatformDependent.directBufferAddress(src) + srcIndex, this.memoryAddress() + index,
+          length);
+    } else {
+      if (srcIndex == 0 && src.capacity() == length) {
+        udle.setBytes(index + offset, src);
+      } else {
+        ByteBuffer newBuf = src.duplicate();
+        newBuf.position(srcIndex);
+        newBuf.limit(srcIndex + length);
+        udle.setBytes(index + offset, newBuf);
+      }
+    }
+
+    return this;
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, byte[] src, int srcIndex, int length) {
+    udle.setBytes(index + offset, src, srcIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, ByteBuffer src) {
+    udle.setBytes(index + offset, src);
+    return this;
+  }
+
+  @Override
+  public int setBytes(int index, InputStream in, int length) throws IOException {
+    return udle.setBytes(index + offset, in, length);
+  }
+
+  @Override
+  public int setBytes(int index, ScatteringByteChannel in, int length) throws IOException {
+    return udle.setBytes(index + offset, in, length);
+  }
+
+  @Override
+  public byte getByte(int index) {
+    chk(index, 1);
+    return PlatformDependent.getByte(addr(index));
+  }
+
+  @Override
+  public void close() {
+    release();
+  }
+
+  /**
+   * Returns the possible memory consumed by this DrillBuf in the worst-case scenario (not shared, connected to a
+   * larger underlying buffer of allocated memory).
+   *
+   * @return Size in bytes.
+   */
+  public int getPossibleMemoryConsumed() {
+    return ledger.getSize();
+  }
+
+  /**
+   * Returns the memory that is accounted for by this buffer (and its potentially shared siblings within the context
+   * of the associated allocator).
+   *
+   * @return Size in bytes.
+   */
+  public int getActualMemoryConsumed() {
+    return ledger.getAccountedSize();
+  }
+
+  private final static int LOG_BYTES_PER_ROW = 10;
+
+  /**
+   * Return the buffer's byte contents in the form of a hex dump.
+   *
+   * @param start
+   *          the starting byte index
+   * @param length
+   *          how many bytes to log
+   * @return A hex dump in a String.
+   */
+  public String toHexString(final int start, final int length) {
+    final int roundedStart = (start / LOG_BYTES_PER_ROW) * LOG_BYTES_PER_ROW;
+
+    final StringBuilder sb = new StringBuilder("buffer byte dump\n");
+    int index = roundedStart;
+    for (int nLogged = 0; nLogged < length; nLogged += LOG_BYTES_PER_ROW) {
+      sb.append(String.format(" [%05d-%05d]", index, index + LOG_BYTES_PER_ROW - 1));
+      for (int i = 0; i < LOG_BYTES_PER_ROW; ++i) {
+        try {
+          final byte b = getByte(index++);
+          sb.append(String.format(" 0x%02x", b));
+        } catch (IndexOutOfBoundsException ioob) {
+          sb.append(" <ioob>");
+        }
+      }
+      sb.append('\n');
+    }
+    return sb.toString();
+  }
+
+  /**
+   * Get the integer id assigned to this DrillBuf for debugging purposes.
+   *
+   * @return integer id
+   */
+  public long getId() {
+    return id;
+  }
+
+  public String toVerboseString() {
+    if (isEmpty) {
+      return toString();
+    }
+
+    StringBuilder sb = new StringBuilder();
+    ledger.print(sb, 0, Verbosity.LOG_WITH_STACKTRACE);
+    return sb.toString();
+  }
+
+  public void print(StringBuilder sb, int indent, Verbosity verbosity) {
+    BaseAllocator.indent(sb, indent).append(toString());
+
+    if (BaseAllocator.DEBUG && !isEmpty && verbosity.includeHistoricalLog) {
+      sb.append("\n");
+      historicalLog.buildHistory(sb, indent + 1, verbosity.includeStackTraces);
+    }
+  }
+
+}
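
Note on the heap-buffer branch of setBytes(int, ByteBuffer, int, int) in the ArrowBuf diff above: that branch is meant to copy only the window [srcIndex, srcIndex + length) of the source. The following is a minimal stdlib-only sketch of that windowing technique, not the patch's code. The class and method names (SubRangeCopySketch, copyWindow) are hypothetical, and a plain byte[] stands in for the underlying buffer.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration; not part of the Arrow patch.
final class SubRangeCopySketch {

  /**
   * Copies length bytes starting at srcIndex from src into dst at dstIndex.
   * The source's own position and limit are left untouched.
   */
  static void copyWindow(ByteBuffer src, int srcIndex, byte[] dst, int dstIndex, int length) {
    // duplicate() shares the bytes but has an independent position and limit.
    ByteBuffer window = src.duplicate();
    // Set the limit before the position so the position never exceeds the limit.
    window.limit(srcIndex + length);
    window.position(srcIndex);
    // Read from the narrowed duplicate, not from the original buffer.
    window.get(dst, dstIndex, length);
  }

  public static void main(String[] args) {
    ByteBuffer src = ByteBuffer.wrap(new byte[] {10, 11, 12, 13, 14, 15});
    byte[] dst = new byte[4];
    copyWindow(src, 2, dst, 1, 3);
    System.out.println(Arrays.toString(dst));
    System.out.println("source position=" + src.position() + ", limit=" + src.limit());
  }
}
```

The point of the sketch is that the narrowed duplicate, rather than the original source, has to be handed to the copy. Otherwise the whole remaining content of the source is transferred.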

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/ExpandableByteBuf.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/ExpandableByteBuf.java b/java/memory/src/main/java/io/netty/buffer/ExpandableByteBuf.java
new file mode 100644
index 0000000..5988647
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/ExpandableByteBuf.java
@@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package io.netty.buffer;
+
+import org.apache.arrow.memory.BufferAllocator;
+
+/**
+ * Allows us to decorate DrillBuf to make it expandable so that we can use it in the context of the Netty framework
+ * (thus supporting RPC level memory accounting).
+ */
+public class ExpandableByteBuf extends MutableWrappedByteBuf {
+
+  private final BufferAllocator allocator;
+
+  public ExpandableByteBuf(ByteBuf buffer, BufferAllocator allocator) {
+    super(buffer);
+    this.allocator = allocator;
+  }
+
+  @Override
+  public ByteBuf copy(int index, int length) {
+    return new ExpandableByteBuf(buffer.copy(index, length), allocator);
+  }
+
+  @Override
+  public ByteBuf capacity(int newCapacity) {
+    if (newCapacity > capacity()) {
+      ByteBuf newBuf = allocator.buffer(newCapacity);
+      newBuf.writeBytes(buffer, 0, buffer.capacity());
+      newBuf.readerIndex(buffer.readerIndex());
+      newBuf.writerIndex(buffer.writerIndex());
+      buffer.release();
+      buffer = newBuf;
+      return newBuf;
+    } else {
+      return super.capacity(newCapacity);
+    }
+  }
+
+}
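
Note on ExpandableByteBuf.capacity(int) above: growing is done by allocating a larger buffer, copying the old contents, carrying the cursor positions over, and swapping the delegate. A rough stdlib-only sketch of the same grow-by-copy idea follows. It assumes java.nio.ByteBuffer as a stand-in for Netty's ByteBuf, so only a single position is preserved instead of separate reader and writer indices, and nothing is released because heap buffers are garbage collected. GrowByCopySketch is a hypothetical name.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not part of the Arrow patch.
final class GrowByCopySketch {

  private ByteBuffer buffer;

  GrowByCopySketch(int initialCapacity) {
    buffer = ByteBuffer.allocate(initialCapacity);
  }

  ByteBuffer buffer() {
    return buffer;
  }

  /** Grows the backing buffer to newCapacity; smaller or equal requests are ignored in this sketch. */
  void capacity(int newCapacity) {
    if (newCapacity <= buffer.capacity()) {
      return;
    }
    ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
    // View the entire old content (position 0, limit = capacity) without moving the old cursor.
    ByteBuffer all = buffer.duplicate();
    all.clear();
    bigger.put(all);
    // Carry the cursor over so callers can continue where they left off.
    bigger.position(buffer.position());
    buffer = bigger;
  }

  public static void main(String[] args) {
    GrowByCopySketch g = new GrowByCopySketch(4);
    g.buffer().put((byte) 1).put((byte) 2).put((byte) 3);
    g.capacity(8);
    System.out.println("capacity=" + g.buffer().capacity() + ", position=" + g.buffer().position());
  }
}
```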

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/LargeBuffer.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/LargeBuffer.java b/java/memory/src/main/java/io/netty/buffer/LargeBuffer.java
new file mode 100644
index 0000000..5f5e904
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/LargeBuffer.java
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package io.netty.buffer;
+
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * A MutableWrappedByteBuf that also maintains a metric of the number of huge buffer bytes and counts.
+ */
+public class LargeBuffer extends MutableWrappedByteBuf {
+
+  private final AtomicLong hugeBufferSize;
+  private final AtomicLong hugeBufferCount;
+
+  private final int initCap;
+
+  public LargeBuffer(ByteBuf buffer, AtomicLong hugeBufferSize, AtomicLong hugeBufferCount) {
+    super(buffer);
+    initCap = buffer.capacity();
+    this.hugeBufferCount = hugeBufferCount;
+    this.hugeBufferSize = hugeBufferSize;
+  }
+
+  @Override
+  public ByteBuf copy(int index, int length) {
+    return new LargeBuffer(buffer.copy(index, length), hugeBufferSize, hugeBufferCount);
+  }
+
+  @Override
+  public boolean release() {
+    return release(1);
+  }
+
+  @Override
+  public boolean release(int decrement) {
+    boolean released = unwrap().release(decrement);
+    if (released) {
+      hugeBufferSize.addAndGet(-initCap);
+      hugeBufferCount.decrementAndGet();
+    }
+    return released;
+  }
+
+}
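
Note on LargeBuffer.release(int) above: the huge-buffer metrics are adjusted only when the final reference is dropped, that is, when the wrapped buffer reports that it was actually released. The sketch below models that behavior with a plain AtomicInteger reference count. It is an illustration under assumptions: HugeBufferMetricsSketch is a hypothetical class, and incrementing the counters in the constructor is assumed here for self-containment (in the patch, the LargeBuffer constructor does not touch the counters, so the increment presumably happens elsewhere).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not part of the Arrow patch.
final class HugeBufferMetricsSketch {

  private final AtomicInteger refCnt = new AtomicInteger(1);
  private final AtomicLong hugeBufferSize;
  private final AtomicLong hugeBufferCount;
  private final int initCap;

  HugeBufferMetricsSketch(int capacity, AtomicLong hugeBufferSize, AtomicLong hugeBufferCount) {
    this.initCap = capacity;
    this.hugeBufferSize = hugeBufferSize;
    this.hugeBufferCount = hugeBufferCount;
    // Assumption: the creator of the buffer records it in the metrics.
    hugeBufferSize.addAndGet(capacity);
    hugeBufferCount.incrementAndGet();
  }

  void retain() {
    refCnt.incrementAndGet();
  }

  /** Returns true only when this call dropped the last reference. */
  boolean release() {
    boolean released = refCnt.decrementAndGet() == 0;
    if (released) {
      // Metrics change exactly once, using the capacity captured at creation time.
      hugeBufferSize.addAndGet(-initCap);
      hugeBufferCount.decrementAndGet();
    }
    return released;
  }

  public static void main(String[] args) {
    AtomicLong size = new AtomicLong();
    AtomicLong count = new AtomicLong();
    HugeBufferMetricsSketch buf = new HugeBufferMetricsSketch(1 << 20, size, count);
    buf.retain();
    System.out.println(buf.release() + " size=" + size.get() + " count=" + count.get());
    System.out.println(buf.release() + " size=" + size.get() + " count=" + count.get());
  }
}
```

Capturing the capacity at construction time (initCap in the patch) keeps the accounting symmetric even if the wrapped buffer's capacity changes later.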

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/MutableWrappedByteBuf.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/MutableWrappedByteBuf.java b/java/memory/src/main/java/io/netty/buffer/MutableWrappedByteBuf.java
new file mode 100644
index 0000000..5709473
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/MutableWrappedByteBuf.java
@@ -0,0 +1,336 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package io.netty.buffer;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.channels.GatheringByteChannel;
+import java.nio.channels.ScatteringByteChannel;
+
+/**
+ * This is basically a complete copy of DuplicatedByteBuf. We copy because we want to override some behaviors and make
+ * the wrapped buffer mutable.
+ */
+abstract class MutableWrappedByteBuf extends AbstractByteBuf {
+
+  @Override
+  public ByteBuffer nioBuffer(int index, int length) {
+    return unwrap().nioBuffer(index, length);
+  }
+
+  ByteBuf buffer;
+
+  public MutableWrappedByteBuf(ByteBuf buffer) {
+    super(buffer.maxCapacity());
+
+    if (buffer instanceof MutableWrappedByteBuf) {
+      this.buffer = ((MutableWrappedByteBuf) buffer).buffer;
+    } else {
+      this.buffer = buffer;
+    }
+
+    setIndex(buffer.readerIndex(), buffer.writerIndex());
+  }
+
+  @Override
+  public ByteBuf unwrap() {
+    return buffer;
+  }
+
+  @Override
+  public ByteBufAllocator alloc() {
+    return buffer.alloc();
+  }
+
+  @Override
+  public ByteOrder order() {
+    return buffer.order();
+  }
+
+  @Override
+  public boolean isDirect() {
+    return buffer.isDirect();
+  }
+
+  @Override
+  public int capacity() {
+    return buffer.capacity();
+  }
+
+  @Override
+  public ByteBuf capacity(int newCapacity) {
+    buffer.capacity(newCapacity);
+    return this;
+  }
+
+  @Override
+  public boolean hasArray() {
+    return buffer.hasArray();
+  }
+
+  @Override
+  public byte[] array() {
+    return buffer.array();
+  }
+
+  @Override
+  public int arrayOffset() {
+    return buffer.arrayOffset();
+  }
+
+  @Override
+  public boolean hasMemoryAddress() {
+    return buffer.hasMemoryAddress();
+  }
+
+  @Override
+  public long memoryAddress() {
+    return buffer.memoryAddress();
+  }
+
+  @Override
+  public byte getByte(int index) {
+    return _getByte(index);
+  }
+
+  @Override
+  protected byte _getByte(int index) {
+    return buffer.getByte(index);
+  }
+
+  @Override
+  public short getShort(int index) {
+    return _getShort(index);
+  }
+
+  @Override
+  protected short _getShort(int index) {
+    return buffer.getShort(index);
+  }
+
+  @Override
+  public int getUnsignedMedium(int index) {
+    return _getUnsignedMedium(index);
+  }
+
+  @Override
+  protected int _getUnsignedMedium(int index) {
+    return buffer.getUnsignedMedium(index);
+  }
+
+  @Override
+  public int getInt(int index) {
+    return _getInt(index);
+  }
+
+  @Override
+  protected int _getInt(int index) {
+    return buffer.getInt(index);
+  }
+
+  @Override
+  public long getLong(int index) {
+    return _getLong(index);
+  }
+
+  @Override
+  protected long _getLong(int index) {
+    return buffer.getLong(index);
+  }
+
+  @Override
+  public abstract ByteBuf copy(int index, int length);
+
+  @Override
+  public ByteBuf slice(int index, int length) {
+    return new SlicedByteBuf(this, index, length);
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
+    buffer.getBytes(index, dst, dstIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, byte[] dst, int dstIndex, int length) {
+    buffer.getBytes(index, dst, dstIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, ByteBuffer dst) {
+    buffer.getBytes(index, dst);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setByte(int index, int value) {
+    _setByte(index, value);
+    return this;
+  }
+
+  @Override
+  protected void _setByte(int index, int value) {
+    buffer.setByte(index, value);
+  }
+
+  @Override
+  public ByteBuf setShort(int index, int value) {
+    _setShort(index, value);
+    return this;
+  }
+
+  @Override
+  protected void _setShort(int index, int value) {
+    buffer.setShort(index, value);
+  }
+
+  @Override
+  public ByteBuf setMedium(int index, int value) {
+    _setMedium(index, value);
+    return this;
+  }
+
+  @Override
+  protected void _setMedium(int index, int value) {
+    buffer.setMedium(index, value);
+  }
+
+  @Override
+  public ByteBuf setInt(int index, int value) {
+    _setInt(index, value);
+    return this;
+  }
+
+  @Override
+  protected void _setInt(int index, int value) {
+    buffer.setInt(index, value);
+  }
+
+  @Override
+  public ByteBuf setLong(int index, long value) {
+    _setLong(index, value);
+    return this;
+  }
+
+  @Override
+  protected void _setLong(int index, long value) {
+    buffer.setLong(index, value);
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, byte[] src, int srcIndex, int length) {
+    buffer.setBytes(index, src, srcIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, ByteBuf src, int srcIndex, int length) {
+    buffer.setBytes(index, src, srcIndex, length);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setBytes(int index, ByteBuffer src) {
+    buffer.setBytes(index, src);
+    return this;
+  }
+
+  @Override
+  public ByteBuf getBytes(int index, OutputStream out, int length)
+      throws IOException {
+    buffer.getBytes(index, out, length);
+    return this;
+  }
+
+  @Override
+  public int getBytes(int index, GatheringByteChannel out, int length)
+      throws IOException {
+    return buffer.getBytes(index, out, length);
+  }
+
+  @Override
+  public int setBytes(int index, InputStream in, int length)
+      throws IOException {
+    return buffer.setBytes(index, in, length);
+  }
+
+  @Override
+  public int setBytes(int index, ScatteringByteChannel in, int length)
+      throws IOException {
+    return buffer.setBytes(index, in, length);
+  }
+
+  @Override
+  public int nioBufferCount() {
+    return buffer.nioBufferCount();
+  }
+
+  @Override
+  public ByteBuffer[] nioBuffers(int index, int length) {
+    return buffer.nioBuffers(index, length);
+  }
+
+  @Override
+  public ByteBuffer internalNioBuffer(int index, int length) {
+    return nioBuffer(index, length);
+  }
+
+  @Override
+  public int forEachByte(int index, int length, ByteBufProcessor processor) {
+    return buffer.forEachByte(index, length, processor);
+  }
+
+  @Override
+  public int forEachByteDesc(int index, int length, ByteBufProcessor processor) {
+    return buffer.forEachByteDesc(index, length, processor);
+  }
+
+  @Override
+  public final int refCnt() {
+    return unwrap().refCnt();
+  }
+
+  @Override
+  public final ByteBuf retain() {
+    unwrap().retain();
+    return this;
+  }
+
+  @Override
+  public final ByteBuf retain(int increment) {
+    unwrap().retain(increment);
+    return this;
+  }
+
+  @Override
+  public boolean release() {
+    return release(1);
+  }
+
+  @Override
+  public boolean release(int decrement) {
+    boolean released = unwrap().release(decrement);
+    return released;
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java b/java/memory/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java
new file mode 100644
index 0000000..1610028
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java
@@ -0,0 +1,272 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package io.netty.buffer;
+
+import io.netty.util.internal.StringUtil;
+
+import java.lang.reflect.Field;
+import java.nio.ByteBuffer;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.arrow.memory.OutOfMemoryException;
+
+import com.codahale.metrics.Gauge;
+import com.codahale.metrics.Histogram;
+import com.codahale.metrics.Metric;
+import com.codahale.metrics.MetricFilter;
+import com.codahale.metrics.MetricRegistry;
+
+/**
+ * The base allocator that we use for all of Drill's memory management. Returns UnsafeDirectLittleEndian buffers.
+ */
+public class PooledByteBufAllocatorL {
+  private static final org.slf4j.Logger memoryLogger = org.slf4j.LoggerFactory.getLogger("drill.allocator");
+
+  private static final int MEMORY_LOGGER_FREQUENCY_SECONDS = 60;
+
+
+  public static final String METRIC_PREFIX = "drill.allocator.";
+
+  private final MetricRegistry registry;
+  private final AtomicLong hugeBufferSize = new AtomicLong(0);
+  private final AtomicLong hugeBufferCount = new AtomicLong(0);
+  private final AtomicLong normalBufferSize = new AtomicLong(0);
+  private final AtomicLong normalBufferCount = new AtomicLong(0);
+
+  private final InnerAllocator allocator;
+  public final UnsafeDirectLittleEndian empty;
+
+  public PooledByteBufAllocatorL(MetricRegistry registry) {
+    this.registry = registry;
+    allocator = new InnerAllocator();
+    empty = new UnsafeDirectLittleEndian(new DuplicatedByteBuf(Unpooled.EMPTY_BUFFER));
+  }
+
+  public UnsafeDirectLittleEndian allocate(int size) {
+    try {
+      return allocator.directBuffer(size, Integer.MAX_VALUE);
+    } catch (OutOfMemoryError e) {
+      throw new OutOfMemoryException("Failure allocating buffer.", e);
+    }
+
+  }
+
+  public int getChunkSize() {
+    return allocator.chunkSize;
+  }
+
+  private class InnerAllocator extends PooledByteBufAllocator {
+
+
+    private final PoolArena<ByteBuffer>[] directArenas;
+    private final MemoryStatusThread statusThread;
+    private final Histogram largeBuffersHist;
+    private final Histogram normalBuffersHist;
+    private final int chunkSize;
+
+    public InnerAllocator() {
+      super(true);
+
+      try {
+        Field f = PooledByteBufAllocator.class.getDeclaredField("directArenas");
+        f.setAccessible(true);
+        this.directArenas = (PoolArena<ByteBuffer>[]) f.get(this);
+      } catch (Exception e) {
+        throw new RuntimeException("Failure while initializing allocator.  Unable to retrieve direct arenas field.", e);
+      }
+
+      this.chunkSize = directArenas[0].chunkSize;
+
+      if (memoryLogger.isTraceEnabled()) {
+        statusThread = new MemoryStatusThread();
+        statusThread.start();
+      } else {
+        statusThread = null;
+      }
+      removeOldMetrics();
+
+      registry.register(METRIC_PREFIX + "normal.size", new Gauge<Long>() {
+        @Override
+        public Long getValue() {
+          return normalBufferSize.get();
+        }
+      });
+
+      registry.register(METRIC_PREFIX + "normal.count", new Gauge<Long>() {
+        @Override
+        public Long getValue() {
+          return normalBufferCount.get();
+        }
+      });
+
+      registry.register(METRIC_PREFIX + "huge.size", new Gauge<Long>() {
+        @Override
+        public Long getValue() {
+          return hugeBufferSize.get();
+        }
+      });
+
+      registry.register(METRIC_PREFIX + "huge.count", new Gauge<Long>() {
+        @Override
+        public Long getValue() {
+          return hugeBufferCount.get();
+        }
+      });
+
+      largeBuffersHist = registry.histogram(METRIC_PREFIX + "huge.hist");
+      normalBuffersHist = registry.histogram(METRIC_PREFIX + "normal.hist");
+
+    }
+
+
+    private synchronized void removeOldMetrics() {
+      registry.removeMatching(new MetricFilter() {
+        @Override
+        public boolean matches(String name, Metric metric) {
+          return name.startsWith("drill.allocator.");
+        }
+
+      });
+    }
+
+    private UnsafeDirectLittleEndian newDirectBufferL(int initialCapacity, int maxCapacity) {
+      PoolThreadCache cache = threadCache.get();
+      PoolArena<ByteBuffer> directArena = cache.directArena;
+
+      if (directArena != null) {
+
+        if (initialCapacity > directArena.chunkSize) {
+          // This is beyond chunk size so we'll allocate separately.
+          ByteBuf buf = UnpooledByteBufAllocator.DEFAULT.directBuffer(initialCapacity, maxCapacity);
+
+          hugeBufferCount.incrementAndGet();
+          hugeBufferSize.addAndGet(buf.capacity());
+          largeBuffersHist.update(buf.capacity());
+          // logger.debug("Allocating huge buffer of size {}", initialCapacity, new Exception());
+          return new UnsafeDirectLittleEndian(new LargeBuffer(buf, hugeBufferSize, hugeBufferCount));
+
+        } else {
+          // within chunk, use arena.
+          ByteBuf buf = directArena.allocate(cache, initialCapacity, maxCapacity);
+          if (!(buf instanceof PooledUnsafeDirectByteBuf)) {
+            throw fail();
+          }
+
+          normalBuffersHist.update(buf.capacity());
+          if (ASSERT_ENABLED) {
+            normalBufferSize.addAndGet(buf.capacity());
+            normalBufferCount.incrementAndGet();
+          }
+
+          return new UnsafeDirectLittleEndian((PooledUnsafeDirectByteBuf) buf, normalBufferCount,
+              normalBufferSize);
+        }
+
+      } else {
+        throw fail();
+      }
+    }
+
+    private UnsupportedOperationException fail() {
+      return new UnsupportedOperationException(
+          "Drill requires that the JVM used supports access to sun.misc.Unsafe.  This platform didn't provide that functionality.");
+    }
+
+    public UnsafeDirectLittleEndian directBuffer(int initialCapacity, int maxCapacity) {
+      if (initialCapacity == 0 && maxCapacity == 0) {
+        newDirectBuffer(initialCapacity, maxCapacity);
+      }
+      validate(initialCapacity, maxCapacity);
+      return newDirectBufferL(initialCapacity, maxCapacity);
+    }
+
+    @Override
+    public ByteBuf heapBuffer(int initialCapacity, int maxCapacity) {
+      throw new UnsupportedOperationException("Drill doesn't support using heap buffers.");
+    }
+
+
+    private void validate(int initialCapacity, int maxCapacity) {
+      if (initialCapacity < 0) {
+        throw new IllegalArgumentException("initialCapacity: " + initialCapacity + " (expected: 0+)");
+      }
+      if (initialCapacity > maxCapacity) {
+        throw new IllegalArgumentException(String.format(
+            "initialCapacity: %d (expected: not greater than maxCapacity(%d))",
+            initialCapacity, maxCapacity));
+      }
+    }
+
+    private class MemoryStatusThread extends Thread {
+
+      public MemoryStatusThread() {
+        super("memory-status-logger");
+        this.setDaemon(true);
+        this.setName("allocation.logger");
+      }
+
+      @Override
+      public void run() {
+        while (true) {
+          memoryLogger.trace("Memory Usage: \n{}", PooledByteBufAllocatorL.this.toString());
+          try {
+            Thread.sleep(MEMORY_LOGGER_FREQUENCY_SECONDS * 1000);
+          } catch (InterruptedException e) {
+            return;
+          }
+
+        }
+      }
+
+    }
+
+    public String toString() {
+      StringBuilder buf = new StringBuilder();
+      buf.append(directArenas.length);
+      buf.append(" direct arena(s):");
+      buf.append(StringUtil.NEWLINE);
+      for (PoolArena<ByteBuffer> a : directArenas) {
+        buf.append(a);
+      }
+
+      buf.append("Large buffers outstanding: ");
+      buf.append(hugeBufferCount.get());
+      buf.append(" totaling ");
+      buf.append(hugeBufferSize.get());
+      buf.append(" bytes.");
+      buf.append('\n');
+      buf.append("Normal buffers outstanding: ");
+      buf.append(normalBufferCount.get());
+      buf.append(" totaling ");
+      buf.append(normalBufferSize.get());
+      buf.append(" bytes.");
+      return buf.toString();
+    }
+
+
+  }
+
+  public static final boolean ASSERT_ENABLED;
+
+  static {
+    boolean isAssertEnabled = false;
+    assert isAssertEnabled = true;
+    ASSERT_ENABLED = isAssertEnabled;
+  }
+
+}
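
Note on the ASSERT_ENABLED idiom in the static initializer above: it relies on the side effect of an assert statement, which only executes when the JVM runs with assertions enabled (-ea). A minimal standalone sketch follows; the class name AssertFlagSketch is hypothetical and not part of this commit.

```java
// Hypothetical standalone class illustrating the idiom; not part of the commit.
class AssertFlagSketch {
  static final boolean ASSERT_ENABLED;

  static {
    boolean enabled = false;
    // The assignment inside the assert only executes when assertions are on (-ea).
    assert enabled = true;
    ASSERT_ENABLED = enabled;
  }

  public static void main(String[] args) {
    System.out.println("assertions enabled: " + ASSERT_ENABLED);
  }
}
```

Running with "java -ea" should report true and a plain "java" run should report false, which is why the allocator only pays for the normal-buffer counters in debug-style runs.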

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java b/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java
new file mode 100644
index 0000000..6495d5d
--- /dev/null
+++ b/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java
@@ -0,0 +1,270 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package io.netty.buffer;
+
+import io.netty.util.internal.PlatformDependent;
+
+import java.nio.ByteOrder;
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * The underlying class we use for little-endian access to memory. Is used underneath DrillBufs to abstract away the
+ * Netty classes and underlying Netty memory management.
+ */
+public final class UnsafeDirectLittleEndian extends WrappedByteBuf {
+  private static final boolean NATIVE_ORDER = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;
+  private static final AtomicLong ID_GENERATOR = new AtomicLong(0);
+
+  public final long id = ID_GENERATOR.incrementAndGet();
+  private final AbstractByteBuf wrapped;
+  private final long memoryAddress;
+
+  private final AtomicLong bufferCount;
+  private final AtomicLong bufferSize;
+  private final long initCap;
+
+  UnsafeDirectLittleEndian(DuplicatedByteBuf buf) {
+    this(buf, true, null, null);
+  }
+
+  UnsafeDirectLittleEndian(LargeBuffer buf) {
+    this(buf, true, null, null);
+  }
+
+  UnsafeDirectLittleEndian(PooledUnsafeDirectByteBuf buf, AtomicLong bufferCount, AtomicLong bufferSize) {
+    this(buf, true, bufferCount, bufferSize);
+
+  }
+
+  private UnsafeDirectLittleEndian(AbstractByteBuf buf, boolean fake, AtomicLong bufferCount, AtomicLong bufferSize) {
+    super(buf);
+    if (!NATIVE_ORDER || buf.order() != ByteOrder.BIG_ENDIAN) {
+      throw new IllegalStateException("Drill only runs on LittleEndian systems.");
+    }
+
+    this.bufferCount = bufferCount;
+    this.bufferSize = bufferSize;
+
+    // initCap is used if we're tracking memory release. If we're in non-debug mode, we'll skip this.
+    this.initCap = ASSERT_ENABLED ? buf.capacity() : -1;
+
+    this.wrapped = buf;
+    this.memoryAddress = buf.memoryAddress();
+  }
+    private long addr(int index) {
+        return memoryAddress + index;
+    }
+
+    @Override
+    public long getLong(int index) {
+//        wrapped.checkIndex(index, 8);
+        long v = PlatformDependent.getLong(addr(index));
+        return v;
+    }
+
+    @Override
+    public float getFloat(int index) {
+        return Float.intBitsToFloat(getInt(index));
+    }
+
+  @Override
+  public ByteBuf slice() {
+    return slice(this.readerIndex(), readableBytes());
+  }
+
+  @Override
+  public ByteBuf slice(int index, int length) {
+    return new SlicedByteBuf(this, index, length);
+  }
+
+  @Override
+  public ByteOrder order() {
+    return ByteOrder.LITTLE_ENDIAN;
+  }
+
+  @Override
+  public ByteBuf order(ByteOrder endianness) {
+    return this;
+  }
+
+  @Override
+  public double getDouble(int index) {
+    return Double.longBitsToDouble(getLong(index));
+  }
+
+  @Override
+  public char getChar(int index) {
+    return (char) getShort(index);
+  }
+
+  @Override
+  public long getUnsignedInt(int index) {
+    return getInt(index) & 0xFFFFFFFFL;
+  }
+
+  @Override
+  public int getInt(int index) {
+    int v = PlatformDependent.getInt(addr(index));
+    return v;
+  }
+
+  @Override
+  public int getUnsignedShort(int index) {
+    return getShort(index) & 0xFFFF;
+  }
+
+  @Override
+  public short getShort(int index) {
+    short v = PlatformDependent.getShort(addr(index));
+    return v;
+  }
+
+  @Override
+  public ByteBuf setShort(int index, int value) {
+    wrapped.checkIndex(index, 2);
+    _setShort(index, value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setInt(int index, int value) {
+    wrapped.checkIndex(index, 4);
+    _setInt(index, value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setLong(int index, long value) {
+    wrapped.checkIndex(index, 8);
+    _setLong(index, value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setChar(int index, int value) {
+    setShort(index, value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf setFloat(int index, float value) {
+    setInt(index, Float.floatToRawIntBits(value));
+    return this;
+  }
+
+  @Override
+  public ByteBuf setDouble(int index, double value) {
+    setLong(index, Double.doubleToRawLongBits(value));
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeShort(int value) {
+    wrapped.ensureWritable(2);
+    _setShort(wrapped.writerIndex, value);
+    wrapped.writerIndex += 2;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeInt(int value) {
+    wrapped.ensureWritable(4);
+    _setInt(wrapped.writerIndex, value);
+    wrapped.writerIndex += 4;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeLong(long value) {
+    wrapped.ensureWritable(8);
+    _setLong(wrapped.writerIndex, value);
+    wrapped.writerIndex += 8;
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeChar(int value) {
+    writeShort(value);
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeFloat(float value) {
+    writeInt(Float.floatToRawIntBits(value));
+    return this;
+  }
+
+  @Override
+  public ByteBuf writeDouble(double value) {
+    writeLong(Double.doubleToRawLongBits(value));
+    return this;
+  }
+
+  private void _setShort(int index, int value) {
+    PlatformDependent.putShort(addr(index), (short) value);
+  }
+
+  private void _setInt(int index, int value) {
+    PlatformDependent.putInt(addr(index), value);
+  }
+
+  private void _setLong(int index, long value) {
+    PlatformDependent.putLong(addr(index), value);
+  }
+
+  @Override
+  public byte getByte(int index) {
+    return PlatformDependent.getByte(addr(index));
+  }
+
+  @Override
+  public ByteBuf setByte(int index, int value) {
+    PlatformDependent.putByte(addr(index), (byte) value);
+    return this;
+  }
+
+  @Override
+  public boolean release() {
+    return release(1);
+  }
+
+  @Override
+  public boolean release(int decrement) {
+    final boolean released = super.release(decrement);
+    if (ASSERT_ENABLED && released && bufferCount != null && bufferSize != null) {
+      bufferCount.decrementAndGet();
+      bufferSize.addAndGet(-initCap);
+    }
+    return released;
+  }
+
+  @Override
+  public int hashCode() {
+    return System.identityHashCode(this);
+  }
+
+  public static final boolean ASSERT_ENABLED;
+
+  static {
+    boolean isAssertEnabled = false;
+    assert isAssertEnabled = true;
+    ASSERT_ENABLED = isAssertEnabled;
+  }
+
+}
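
Note on the unsigned accessors above: getUnsignedInt and getUnsignedShort widen the signed value and mask off the sign extension. The sketch below shows the same masking on little-endian data. It is only an illustration under assumptions: it uses java.nio.ByteBuffer rather than Netty and sun.misc.Unsafe, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical standalone class; uses java.nio.ByteBuffer, not the commit's Unsafe-based access.
class LittleEndianSketch {
  // Read a 32-bit value and mask it into the non-negative range of a long.
  static long unsignedInt(ByteBuffer buf, int index) {
    return buf.getInt(index) & 0xFFFFFFFFL;
  }

  // Read a 16-bit value and mask it into the non-negative range of an int.
  static int unsignedShort(ByteBuffer buf, int index) {
    return buf.getShort(index) & 0xFFFF;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);

    // In little-endian order the least significant byte is stored first.
    buf.putInt(0, 0x01020304);
    System.out.println("first byte: " + buf.get(0));

    // All-ones bit patterns read back as -1 when signed; masking recovers the unsigned value.
    buf.putInt(0, -1);
    buf.putShort(4, (short) -1);
    System.out.println("unsigned int: " + unsignedInt(buf, 0));
    System.out.println("unsigned short: " + unsignedShort(buf, 4));
  }
}
```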

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/Accountant.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/Accountant.java b/java/memory/src/main/java/org/apache/arrow/memory/Accountant.java
new file mode 100644
index 0000000..dc75e5d
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/Accountant.java
@@ -0,0 +1,272 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import java.util.concurrent.atomic.AtomicLong;
+
+import javax.annotation.concurrent.ThreadSafe;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * Provides a concurrent, lock-free way to account for memory usage. Used as the basis for Allocators. All
+ * operations are thread-safe (except for close).
+ */
+@ThreadSafe
+class Accountant implements AutoCloseable {
+  // private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(Accountant.class);
+
+  /**
+   * The parent allocator
+   */
+  protected final Accountant parent;
+
+  /**
+   * The amount of memory reserved for this allocator. Releases below this amount of memory will not be returned to the
+   * parent Accountant until this Accountant is closed.
+   */
+  protected final long reservation;
+
+  private final AtomicLong peakAllocation = new AtomicLong();
+
+  /**
+   * Maximum local memory that can be held. This can be externally updated. Changing it won't cause past memory to
+   * change but will change responses to future allocation efforts
+   */
+  private final AtomicLong allocationLimit = new AtomicLong();
+
+  /**
+   * Currently allocated amount of memory.
+   */
+  private final AtomicLong locallyHeldMemory = new AtomicLong();
+
+  public Accountant(Accountant parent, long reservation, long maxAllocation) {
+    Preconditions.checkArgument(reservation >= 0, "The initial reservation size must be non-negative.");
+    Preconditions.checkArgument(maxAllocation >= 0, "The maximum allocation limit must be non-negative.");
+    Preconditions.checkArgument(reservation <= maxAllocation,
+        "The initial reservation size must be <= the maximum allocation.");
+    Preconditions.checkArgument(reservation == 0 || parent != null, "The root accountant can't reserve memory.");
+
+    this.parent = parent;
+    this.reservation = reservation;
+    this.allocationLimit.set(maxAllocation);
+
+    if (reservation != 0) {
+      // we will allocate a reservation from our parent.
+      final AllocationOutcome outcome = parent.allocateBytes(reservation);
+      if (!outcome.isOk()) {
+        throw new OutOfMemoryException(String.format(
+            "Failure trying to allocate initial reservation for Allocator. "
+                + "Attempted to allocate %d bytes and received an outcome of %s.", reservation, outcome.name()));
+      }
+    }
+  }
+
+  /**
+   * Attempt to allocate the requested amount of memory. Either completely succeeds or completely fails. Constructs a
+   * log of delta
+   *
+   * If it fails, no changes are made to accounting.
+   *
+   * @param size
+   *          The amount of memory to reserve in bytes.
+   * @return The outcome of the allocation; isOk() reports whether the allocation was successful.
+   */
+  AllocationOutcome allocateBytes(long size) {
+    final AllocationOutcome outcome = allocate(size, true, false);
+    if (!outcome.isOk()) {
+      releaseBytes(size);
+    }
+    return outcome;
+  }
+
+  private void updatePeak() {
+    final long currentMemory = locallyHeldMemory.get();
+    while (true) {
+
+      final long previousPeak = peakAllocation.get();
+      if (currentMemory > previousPeak) {
+        if (!peakAllocation.compareAndSet(previousPeak, currentMemory)) {
+          // peak allocation changed underneath us. try again.
+          continue;
+        }
+      }
+
+      // we either succeeded to set peak allocation or we weren't above the previous peak, exit.
+      return;
+    }
+  }
+
+
+  /**
+   * Increase the accounting. Returns whether the allocation fit within limits.
+   *
+   * @param size
+   *          to increase
+   * @return Whether the allocation fit within limits.
+   */
+  boolean forceAllocate(long size) {
+    final AllocationOutcome outcome = allocate(size, true, true);
+    return outcome.isOk();
+  }
+
+  /**
+   * Internal method for allocation. This takes a forced approach to allocation to ensure that we manage reservation
+   * boundary issues consistently. Allocation is always done through the entire tree. The two options that we influence
+   * are whether the allocation should be forced and whether or not the peak memory allocation should be updated. If at
+   * some point during allocation escalation we determine that the allocation is no longer possible, we will continue to
+   * do a complete and consistent allocation but we will stop updating the peak allocation. We do this because we know
+   * that we will be directly unwinding this allocation (and thus never actually making the allocation). If force
+   * allocation is passed, then we continue to update the peak limits since we now know that this allocation will occur
+   * despite our moving past one or more limits.
+   *
+   * @param size
+   *          The size of the allocation.
+   * @param incomingUpdatePeak
+   *          Whether we should update the local peak for this allocation.
+   * @param forceAllocation
+   *          Whether we should force the allocation.
+   * @return The outcome of the allocation.
+   */
+  private AllocationOutcome allocate(final long size, final boolean incomingUpdatePeak, final boolean forceAllocation) {
+    final long newLocal = locallyHeldMemory.addAndGet(size);
+    final long beyondReservation = newLocal - reservation;
+    final boolean beyondLimit = newLocal > allocationLimit.get();
+    final boolean updatePeak = forceAllocation || (incomingUpdatePeak && !beyondLimit);
+
+    AllocationOutcome parentOutcome = AllocationOutcome.SUCCESS;
+    if (beyondReservation > 0 && parent != null) {
+      // we need to get memory from our parent.
+      final long parentRequest = Math.min(beyondReservation, size);
+      parentOutcome = parent.allocate(parentRequest, updatePeak, forceAllocation);
+    }
+
+    final AllocationOutcome finalOutcome = beyondLimit ? AllocationOutcome.FAILED_LOCAL :
+        parentOutcome.ok ? AllocationOutcome.SUCCESS : AllocationOutcome.FAILED_PARENT;
+
+    if (updatePeak) {
+      updatePeak();
+    }
+
+    return finalOutcome;
+  }
+
+  public void releaseBytes(long size) {
+    // reduce local memory. all memory released above reservation should be released up the tree.
+    final long newSize = locallyHeldMemory.addAndGet(-size);
+
+    Preconditions.checkArgument(newSize >= 0, "Accounted size went negative.");
+
+    final long originalSize = newSize + size;
+    if(originalSize > reservation && parent != null){
+      // we deallocated memory that we should release to our parent.
+      final long possibleAmountToReleaseToParent = originalSize - reservation;
+      final long actualToReleaseToParent = Math.min(size, possibleAmountToReleaseToParent);
+      parent.releaseBytes(actualToReleaseToParent);
+    }
+
+  }
+
+  /**
+   * Set the maximum amount of memory that can be allocated in this Accountant before failing an allocation.
+   *
+   * @param newLimit
+   *          The limit in bytes.
+   */
+  public void setLimit(long newLimit) {
+    allocationLimit.set(newLimit);
+  }
+
+  public boolean isOverLimit() {
+    return getAllocatedMemory() > getLimit() || (parent != null && parent.isOverLimit());
+  }
+
+  /**
+   * Close this Accountant. This will release any reservation bytes back to a parent Accountant.
+   */
+  public void close() {
+    // return memory reservation to parent allocator.
+    if (parent != null) {
+      parent.releaseBytes(reservation);
+    }
+  }
+
+  /**
+   * Return the current limit of this Accountant.
+   *
+   * @return Limit in bytes.
+   */
+  public long getLimit() {
+    return allocationLimit.get();
+  }
+
+  /**
+   * Return the current amount of allocated memory that this Accountant is managing accounting for. Note this does not
+   * include reservation memory that hasn't been allocated.
+   *
+   * @return Currently allocated memory in bytes.
+   */
+  public long getAllocatedMemory() {
+    return locallyHeldMemory.get();
+  }
+
+  /**
+   * The peak memory allocated by this Accountant.
+   *
+   * @return The peak allocated memory in bytes.
+   */
+  public long getPeakMemoryAllocation() {
+    return peakAllocation.get();
+  }
+
+  /**
+   * Describes the type of outcome that occurred when trying to account for allocation of memory.
+   */
+  public static enum AllocationOutcome {
+
+    /**
+     * Allocation succeeded.
+     */
+    SUCCESS(true),
+
+    /**
+     * Allocation succeeded but only because the allocator was forced to move beyond a limit.
+     */
+    FORCED_SUCESS(true),
+
+    /**
+     * Allocation failed because the local allocator's limits were exceeded.
+     */
+    FAILED_LOCAL(false),
+
+    /**
+     * Allocation failed because a parent allocator's limits were exceeded.
+     */
+    FAILED_PARENT(false);
+
+    private final boolean ok;
+
+    AllocationOutcome(boolean ok) {
+      this.ok = ok;
+    }
+
+    public boolean isOk() {
+      return ok;
+    }
+  }
+}
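
Note on the reservation arithmetic in allocate() and releaseBytes() above: only the portion of a request that pushes local usage past the reservation is charged to the parent, and only the portion of a release that was above the reservation is handed back. The sketch below models just that arithmetic. It is a simplification under assumptions: limits, peak tracking and AllocationOutcome are left out, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the reservation logic; no limits or peak tracking.
class ReservationSketch {
  final ReservationSketch parent;
  final long reservation;
  final AtomicLong held = new AtomicLong();

  ReservationSketch(ReservationSketch parent, long reservation) {
    this.parent = parent;
    this.reservation = reservation;
    if (parent != null && reservation > 0) {
      // The reservation itself is charged to the parent up front.
      parent.charge(reservation);
    }
  }

  void charge(long size) {
    long newLocal = held.addAndGet(size);
    long beyondReservation = newLocal - reservation;
    if (beyondReservation > 0 && parent != null) {
      // Only the part that exceeds the reservation goes up the tree.
      parent.charge(Math.min(beyondReservation, size));
    }
  }

  void release(long size) {
    long newSize = held.addAndGet(-size);
    long originalSize = newSize + size;
    if (originalSize > reservation && parent != null) {
      // Only the part that was above the reservation is returned to the parent.
      parent.release(Math.min(size, originalSize - reservation));
    }
  }

  public static void main(String[] args) {
    ReservationSketch root = new ReservationSketch(null, 0);
    ReservationSketch child = new ReservationSketch(root, 100);
    child.charge(60);   // fits inside the reservation, root is not charged again
    child.charge(60);   // 20 bytes exceed the reservation and are charged to root
    System.out.println("root holds " + root.held.get());
    child.release(60);  // the 20 bytes above the reservation go back to root
    child.release(60);  // entirely inside the reservation, root is unchanged
    System.out.println("root holds " + root.held.get());
  }
}
```

With a reservation of 100, the root's count moves only for the 20 bytes that exceed it, which is the behaviour the reservation field's comment describes.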


[10/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/AllocationManager.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/AllocationManager.java b/java/memory/src/main/java/org/apache/arrow/memory/AllocationManager.java
new file mode 100644
index 0000000..0db6144
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/AllocationManager.java
@@ -0,0 +1,433 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import static org.apache.arrow.memory.BaseAllocator.indent;
+import io.netty.buffer.ArrowBuf;
+import io.netty.buffer.PooledByteBufAllocatorL;
+import io.netty.buffer.UnsafeDirectLittleEndian;
+
+import java.util.IdentityHashMap;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.concurrent.locks.ReadWriteLock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
+
+import org.apache.arrow.memory.BaseAllocator.Verbosity;
+import org.apache.arrow.memory.util.AutoCloseableLock;
+import org.apache.arrow.memory.util.HistoricalLog;
+import org.apache.arrow.memory.util.Metrics;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * Manages the relationship between one or more allocators and a particular UDLE. Ensures that one allocator owns the
+ * memory that multiple allocators may be referencing. Manages a BufferLedger between each of its associated allocators.
+ * This class is also responsible for managing when memory is allocated and returned to the Netty-based
+ * PooledByteBufAllocatorL.
+ *
+ * The only reason that this isn't package private is that we're forced to put DrillBuf in Netty's package, which
+ * needs access to these objects or methods.
+ *
+ * Threading: AllocationManager manages thread-safety internally. Operations within the context of a single BufferLedger
+ * are lockless in nature and can be leveraged by multiple threads. Operations that cross the context of two ledgers
+ * will acquire a lock on the AllocationManager instance. Important note, there is one AllocationManager per
+ * UnsafeDirectLittleEndian buffer allocation. As such, there will be thousands of these in a typical query. The
+ * contention of acquiring a lock on AllocationManager should be very low.
+ *
+ */
+public class AllocationManager {
+  // private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AllocationManager.class);
+
+  private static final AtomicLong MANAGER_ID_GENERATOR = new AtomicLong(0);
+  private static final AtomicLong LEDGER_ID_GENERATOR = new AtomicLong(0);
+  static final PooledByteBufAllocatorL INNER_ALLOCATOR = new PooledByteBufAllocatorL(Metrics.getInstance());
+
+  private final RootAllocator root;
+  private final long allocatorManagerId = MANAGER_ID_GENERATOR.incrementAndGet();
+  private final int size;
+  private final UnsafeDirectLittleEndian underlying;
+  private final IdentityHashMap<BufferAllocator, BufferLedger> map = new IdentityHashMap<>();
+  private final ReadWriteLock lock = new ReentrantReadWriteLock();
+  private final AutoCloseableLock readLock = new AutoCloseableLock(lock.readLock());
+  private final AutoCloseableLock writeLock = new AutoCloseableLock(lock.writeLock());
+  private final long amCreationTime = System.nanoTime();
+
+  private volatile BufferLedger owningLedger;
+  private volatile long amDestructionTime = 0;
+
+  AllocationManager(BaseAllocator accountingAllocator, int size) {
+    Preconditions.checkNotNull(accountingAllocator);
+    accountingAllocator.assertOpen();
+
+    this.root = accountingAllocator.root;
+    this.underlying = INNER_ALLOCATOR.allocate(size);
+
+    // we do a no retain association since our creator will want to retrieve the newly created ledger and will create a
+    // reference count at that point
+    this.owningLedger = associate(accountingAllocator, false);
+    this.size = underlying.capacity();
+  }
+
+  /**
+   * Associate the existing underlying buffer with a new allocator. This will increase the reference count of the
+   * returned ledger by 1.
+   * @param allocator
+   *          The target allocator to associate this buffer with.
+   * @return The Ledger (new or existing) that associates the underlying buffer to this new allocator.
+   */
+  BufferLedger associate(final BaseAllocator allocator) {
+    return associate(allocator, true);
+  }
+
+  private BufferLedger associate(final BaseAllocator allocator, final boolean retain) {
+    allocator.assertOpen();
+
+    if (root != allocator.root) {
+      throw new IllegalStateException(
+          "A buffer can only be associated between two allocators that share the same root.");
+    }
+
+    try (AutoCloseableLock read = readLock.open()) {
+
+      final BufferLedger ledger = map.get(allocator);
+      if (ledger != null) {
+        if (retain) {
+          ledger.inc();
+        }
+        return ledger;
+      }
+
+    }
+    try (AutoCloseableLock write = writeLock.open()) {
+      // we have to recheck existing ledger since a second reader => writer could be competing with us.
+
+      final BufferLedger existingLedger = map.get(allocator);
+      if (existingLedger != null) {
+        if (retain) {
+          existingLedger.inc();
+        }
+        return existingLedger;
+      }
+
+      final BufferLedger ledger = new BufferLedger(allocator, new ReleaseListener(allocator));
+      if (retain) {
+        ledger.inc();
+      }
+      BufferLedger oldLedger = map.put(allocator, ledger);
+      Preconditions.checkArgument(oldLedger == null);
+      allocator.associateLedger(ledger);
+      return ledger;
+    }
+  }
+
+
+  /**
+   * The way that a particular BufferLedger communicates back to the AllocationManager that it no longer needs to hold
+   * a reference to a particular piece of memory.
+   */
+  private class ReleaseListener {
+
+    private final BufferAllocator allocator;
+
+    public ReleaseListener(BufferAllocator allocator) {
+      this.allocator = allocator;
+    }
+
+    /**
+     * Can only be called when you already hold the writeLock.
+     */
+    public void release() {
+      allocator.assertOpen();
+
+      final BufferLedger oldLedger = map.remove(allocator);
+      oldLedger.allocator.dissociateLedger(oldLedger);
+
+      if (oldLedger == owningLedger) {
+        if (map.isEmpty()) {
+          // no one else owns, let's release.
+          oldLedger.allocator.releaseBytes(size);
+          underlying.release();
+          amDestructionTime = System.nanoTime();
+          owningLedger = null;
+        } else {
+          // we need to change the owning allocator. we've been removed so we'll get whatever is top of list
+          BufferLedger newLedger = map.values().iterator().next();
+
+          // we'll forcefully transfer the ownership and not worry about whether we exceeded the limit
+          // since this consumer can't do anything with this.
+          oldLedger.transferBalance(newLedger);
+        }
+      } else {
+        if (map.isEmpty()) {
+          throw new IllegalStateException("The final removal of a ledger should be connected to the owning ledger.");
+        }
+      }
+
+
+    }
+  }
+
+  /**
+   * The reference manager that binds an allocator manager to a particular BaseAllocator. Also responsible for creating
+   * a set of ArrowBufs that share a common fate and set of reference counts.
+   * As with AllocationManager, the only reason this is public is due to ArrowBuf being in the io.netty.buffer package.
+   */
+  public class BufferLedger {
+
+    private final IdentityHashMap<ArrowBuf, Object> buffers =
+        BaseAllocator.DEBUG ? new IdentityHashMap<ArrowBuf, Object>() : null;
+
+    private final long ledgerId = LEDGER_ID_GENERATOR.incrementAndGet(); // unique ID assigned to each ledger
+    private final AtomicInteger bufRefCnt = new AtomicInteger(0); // start at zero so we can manage request for retain
+                                                                  // correctly
+    private final long lCreationTime = System.nanoTime();
+    private volatile long lDestructionTime = 0;
+    private final BaseAllocator allocator;
+    private final ReleaseListener listener;
+    private final HistoricalLog historicalLog = BaseAllocator.DEBUG ? new HistoricalLog(BaseAllocator.DEBUG_LOG_LENGTH,
+        "BufferLedger[%d]", 1)
+        : null;
+
+    private BufferLedger(BaseAllocator allocator, ReleaseListener listener) {
+      this.allocator = allocator;
+      this.listener = listener;
+    }
+
+    /**
+     * Transfer any balance the current ledger has to the target ledger. In the case that the current ledger holds no
+     * memory, no transfer is made to the new ledger.
+     * @param target
+     *          The ledger to transfer ownership account to.
+     * @return Whether transfer fit within target ledger's limits.
+     */
+    public boolean transferBalance(final BufferLedger target) {
+      Preconditions.checkNotNull(target);
+      Preconditions.checkArgument(allocator.root == target.allocator.root,
+          "You can only transfer between two allocators that share the same root.");
+      allocator.assertOpen();
+
+      target.allocator.assertOpen();
+      // if we're transferring to ourself, just return.
+      if (target == this) {
+        return true;
+      }
+
+      // since two balance transfers out from the allocator manager could cause incorrect accounting, we need to ensure
+      // that this won't happen by synchronizing on the allocator manager instance.
+      try (AutoCloseableLock write = writeLock.open()) {
+        if (owningLedger != this) {
+          return true;
+        }
+
+        if (BaseAllocator.DEBUG) {
+          this.historicalLog.recordEvent("transferBalance(%s)", target.allocator.name);
+          target.historicalLog.recordEvent("incoming(from %s)", owningLedger.allocator.name);
+        }
+
+        boolean overlimit = target.allocator.forceAllocate(size);
+        allocator.releaseBytes(size);
+        owningLedger = target;
+        return overlimit;
+      }
+
+    }
+
+    /**
+     * Print the current ledger state to the provided StringBuilder.
+     * @param sb
+     *          The StringBuilder to populate.
+     * @param indent
+     *          The level of indentation to position the data.
+     * @param verbosity
+     *          The level of verbosity to print.
+     */
+    public void print(StringBuilder sb, int indent, Verbosity verbosity) {
+      indent(sb, indent)
+          .append("ledger[")
+          .append(ledgerId)
+          .append("] allocator: ")
+          .append(allocator.name)
+          .append(", isOwning: ")
+          .append(owningLedger == this)
+          .append(", size: ")
+          .append(size)
+          .append(", references: ")
+          .append(bufRefCnt.get())
+          .append(", life: ")
+          .append(lCreationTime)
+          .append("..")
+          .append(lDestructionTime)
+          .append(", allocatorManager: [")
+          .append(AllocationManager.this.allocatorManagerId)
+          .append(", life: ")
+          .append(amCreationTime)
+          .append("..")
+          .append(amDestructionTime);
+
+      if (!BaseAllocator.DEBUG) {
+        sb.append("]\n");
+      } else {
+        synchronized (buffers) {
+          sb.append("] holds ")
+              .append(buffers.size())
+              .append(" buffers. \n");
+          for (ArrowBuf buf : buffers.keySet()) {
+            buf.print(sb, indent + 2, verbosity);
+            sb.append('\n');
+          }
+        }
+      }
+
+    }
+
+    private void inc() {
+      bufRefCnt.incrementAndGet();
+    }
+
+    /**
+     * Decrement the ledger's reference count. If the ledger is decremented to zero, this ledger should release its
+     * ownership back to the AllocationManager.
+     */
+    public int decrement(int decrement) {
+      allocator.assertOpen();
+
+      final int outcome;
+      try (AutoCloseableLock write = writeLock.open()) {
+        outcome = bufRefCnt.addAndGet(-decrement);
+        if (outcome == 0) {
+          lDestructionTime = System.nanoTime();
+          listener.release();
+        }
+      }
+
+      return outcome;
+    }
+
+    /**
+     * Returns the ledger associated with a particular BufferAllocator. If the BufferAllocator doesn't currently have a
+     * ledger associated with this AllocationManager, a new one is created. This is placed on BufferLedger rather than
+     * AllocationManager directly because ArrowBufs don't have access to AllocationManager and they are the ones
+     * responsible for exposing the ability to associate multiple allocators with a particular piece of underlying
+     * memory. Note that this will increment the reference count of this ledger by one to ensure the ledger isn't
+     * destroyed before use.
+     *
+     * @param allocator The allocator to look up (or create) a ledger for.
+     * @return The ledger, new or existing, that associates the given allocator with the underlying memory.
+     */
+    public BufferLedger getLedgerForAllocator(BufferAllocator allocator) {
+      return associate((BaseAllocator) allocator);
+    }
+
+    /**
+     * Create a new ArrowBuf associated with this AllocationManager and memory. Does not impact reference count.
+     * Typically used for slicing.
+     * @param offset
+     *          The offset in bytes to start this new ArrowBuf.
+     * @param length
+     *          The length in bytes that this ArrowBuf will provide access to.
+     * @return A new ArrowBuf that shares references with all ArrowBufs associated with this BufferLedger
+     */
+    public ArrowBuf newDrillBuf(int offset, int length) {
+      allocator.assertOpen();
+      return newDrillBuf(offset, length, null);
+    }
+
+    /**
+     * Create a new ArrowBuf associated with this AllocationManager and memory.
+     * @param offset
+     *          The offset in bytes to start this new ArrowBuf.
+     * @param length
+     *          The length in bytes that this ArrowBuf will provide access to.
+     * @param manager
+     *          An optional BufferManager argument that can be used to manage expansion of this ArrowBuf
+     * @param retain
+     *          Whether or not the newly created buffer should get an additional reference count added to it.
+     * @return A new ArrowBuf that shares references with all ArrowBufs associated with this BufferLedger
+     */
+    public ArrowBuf newDrillBuf(int offset, int length, BufferManager manager) {
+      allocator.assertOpen();
+
+      final ArrowBuf buf = new ArrowBuf(
+          bufRefCnt,
+          this,
+          underlying,
+          manager,
+          allocator.getAsByteBufAllocator(),
+          offset,
+          length,
+          false);
+
+      if (BaseAllocator.DEBUG) {
+        historicalLog.recordEvent(
+            "DrillBuf(BufferLedger, BufferAllocator[%s], UnsafeDirectLittleEndian[identityHashCode == "
+                + "%d](%s)) => ledger hc == %d",
+            allocator.name, System.identityHashCode(buf), buf.toString(),
+            System.identityHashCode(this));
+
+        synchronized (buffers) {
+          buffers.put(buf, null);
+        }
+      }
+
+      return buf;
+
+    }
+
+    /**
+     * What is the total size (in bytes) of memory underlying this ledger.
+     *
+     * @return Size in bytes
+     */
+    public int getSize() {
+      return size;
+    }
+
+    /**
+     * How much memory is accounted for by this ledger. This is either getSize() if this is the owning ledger for the
+     * memory or zero in the case that this is not the owning ledger associated with this memory.
+     *
+     * @return Amount of accounted(owned) memory associated with this ledger.
+     */
+    public int getAccountedSize() {
+      try (AutoCloseableLock read = readLock.open()) {
+        if (owningLedger == this) {
+          return size;
+        } else {
+          return 0;
+        }
+      }
+    }
+
+    /**
+     * Package visible for debugging/verification only.
+     */
+    UnsafeDirectLittleEndian getUnderlying() {
+      return underlying;
+    }
+
+    /**
+     * Package visible for debugging/verification only.
+     */
+    boolean isOwningLedger() {
+      return this == owningLedger;
+    }
+
+  }
+
+}
\ No newline at end of file
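
The associate() method above checks the map under the read lock first and re-checks under the write lock before creating a ledger. A minimal, self-contained sketch of that double-checked pattern follows. LedgerRegistrySketch and its nested Ledger are hypothetical names used only for illustration; they are not part of the patch, and a plain Object stands in for the allocator key.

```java
import java.util.IdentityHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration class, not part of the Arrow patch.
final class LedgerRegistrySketch {

  // Stand-in for BufferLedger: only the reference count is modelled.
  static final class Ledger {
    final AtomicInteger refCnt = new AtomicInteger(0);
  }

  private final IdentityHashMap<Object, Ledger> map = new IdentityHashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  Ledger associate(Object allocator, boolean retain) {
    // Fast path: most calls find an existing ledger and only need the shared read lock.
    lock.readLock().lock();
    try {
      Ledger existing = map.get(allocator);
      if (existing != null) {
        if (retain) {
          existing.refCnt.incrementAndGet();
        }
        return existing;
      }
    } finally {
      lock.readLock().unlock();
    }

    // Slow path: another thread may have inserted a ledger between the two locks, so look again.
    lock.writeLock().lock();
    try {
      Ledger existing = map.get(allocator);
      if (existing != null) {
        if (retain) {
          existing.refCnt.incrementAndGet();
        }
        return existing;
      }
      Ledger created = new Ledger();
      if (retain) {
        created.refCnt.incrementAndGet();
      }
      map.put(allocator, created);
      return created;
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    LedgerRegistrySketch registry = new LedgerRegistrySketch();
    Object allocatorA = new Object();
    Ledger first = registry.associate(allocatorA, true);
    Ledger second = registry.associate(allocatorA, true);
    System.out.println(first == second);    // true: the same ledger is reused
    System.out.println(first.refCnt.get()); // 2: one retain per call
  }
}
```

The read lock is released before the write lock is taken because ReentrantReadWriteLock does not support upgrading a read lock to a write lock; that gap is the reason for the second lookup.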

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/AllocationReservation.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/AllocationReservation.java b/java/memory/src/main/java/org/apache/arrow/memory/AllocationReservation.java
new file mode 100644
index 0000000..68d1244
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/AllocationReservation.java
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import io.netty.buffer.ArrowBuf;
+
+/**
+ * Supports cumulative allocation reservation. Clients may increase the size of the reservation repeatedly until they
+ * call for an allocation of the current total size. The reservation can only be used once, and will throw an exception
+ * if it is used more than once.
+ * <p>
+ * For the purposes of airtight memory accounting, the reservation must be close()d whether it is used or not.
+ * This is not threadsafe.
+ */
+public interface AllocationReservation extends AutoCloseable {
+
+  /**
+   * Add to the current reservation.
+   *
+   * <p>Adding may fail if the allocator is not allowed to consume any more space.
+   *
+   * @param nBytes the number of bytes to add
+   * @return true if the addition is possible, false otherwise
+   * @throws IllegalStateException if called after buffer() is used to allocate the reservation
+   */
+  boolean add(final int nBytes);
+
+  /**
+   * Requests a reservation of additional space.
+   *
+   * <p>The implementation of the allocator's inner class provides this.
+   *
+   * @param nBytes the amount to reserve
+   * @return true if the reservation can be satisfied, false otherwise
+   */
+  boolean reserve(int nBytes);
+
+  /**
+   * Allocate a buffer whose size is the total of all the add()s made.
+   *
+   * <p>The allocation request can still fail, even if the amount of space
+   * requested is available, if the allocation cannot be made contiguously.
+   *
+   * @return the buffer, or null, if the request cannot be satisfied
+   * @throws IllegalStateException if called more than once
+   */
+  ArrowBuf allocateBuffer();
+
+  /**
+   * Get the current size of the reservation (the sum of all the add()s).
+   *
+   * @return size of the current reservation
+   */
+  int getSize();
+
+  /**
+   * Return whether or not the reservation has been used.
+   *
+   * @return whether or not the reservation has been used
+   */
+  public boolean isUsed();
+
+  /**
+   * Return whether or not the reservation has been closed.
+   *
+   * @return whether or not the reservation has been closed
+   */
+  public boolean isClosed();
+
+  public void close();
+}
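
The interface above describes a reservation that grows through repeated add() calls, can be turned into a buffer exactly once, and must be closed either way. The following is a rough, self-contained model of that contract, not the Arrow implementation: Budget, Reservation, and nextPowerOfTwo here are hypothetical stand-ins, a byte[] replaces ArrowBuf, and the per-add rounding to a power of two mirrors what BaseAllocator.Reservation does in this patch.

```java
// Hypothetical illustration class, not part of the Arrow patch.
final class ReservationSketch {

  // Stand-in for the allocator's accounting: a fixed byte budget.
  static final class Budget {
    private final long limit;
    private long reserved;

    Budget(long limit) {
      this.limit = limit;
    }

    boolean tryReserve(long nBytes) {
      if (reserved + nBytes > limit) {
        return false;
      }
      reserved += nBytes;
      return true;
    }

    void release(long nBytes) {
      reserved -= nBytes;
    }

    long reserved() {
      return reserved;
    }
  }

  // Not thread-safe, like the interface it models.
  static final class Reservation implements AutoCloseable {
    private final Budget budget;
    private int nBytes;
    private boolean used;
    private boolean closed;

    Reservation(Budget budget) {
      this.budget = budget;
    }

    boolean add(int requested) {
      if (requested < 0) {
        throw new IllegalArgumentException("requested < 0");
      }
      if (closed || used) {
        throw new IllegalStateException("reservation already used or closed");
      }
      int rounded = nextPowerOfTwo(requested); // each addition is rounded up on its own
      if (!budget.tryReserve(rounded)) {
        return false;
      }
      nBytes += rounded;
      return true;
    }

    byte[] allocateBuffer() {
      if (closed || used) {
        throw new IllegalStateException("reservation already used or closed");
      }
      used = true;
      return new byte[nBytes]; // the reserved bytes now belong to the buffer
    }

    int getSize() {
      return nBytes;
    }

    @Override
    public void close() {
      if (closed) {
        return;
      }
      if (!used) {
        budget.release(nBytes); // an unused reservation hands its bytes back
      }
      closed = true;
    }
  }

  // Assumed helper; valid for 0 <= val <= 2^30 (larger values would overflow int).
  static int nextPowerOfTwo(int val) {
    int highest = Integer.highestOneBit(val);
    return highest == val ? val : highest << 1;
  }

  public static void main(String[] args) {
    Budget budget = new Budget(1024);
    try (Reservation r = new Reservation(budget)) {
      r.add(100); // rounded to 128
      r.add(300); // rounded to 512
      System.out.println(r.getSize());               // 640
      System.out.println(r.allocateBuffer().length); // 640
    }
    try (Reservation unused = new Reservation(budget)) {
      System.out.println(unused.add(512)); // false: 640 + 512 exceeds 1024
      System.out.println(unused.add(256)); // true
    }
    System.out.println(budget.reserved()); // 640: the unused 256 bytes were released
  }
}
```

Using try-with-resources keeps the accounting balanced whether or not allocateBuffer() is ever called, which is the point of the interface extending AutoCloseable.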

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/AllocatorClosedException.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/AllocatorClosedException.java b/java/memory/src/main/java/org/apache/arrow/memory/AllocatorClosedException.java
new file mode 100644
index 0000000..5664579
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/AllocatorClosedException.java
@@ -0,0 +1,31 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+/**
+ * Exception thrown when a closed BufferAllocator is used. Note
+ * this is an unchecked exception.
+ *
+ * @param message string associated with the cause
+ */
+@SuppressWarnings("serial")
+public class AllocatorClosedException extends RuntimeException {
+  public AllocatorClosedException(String message) {
+    super(message);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/BaseAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/BaseAllocator.java b/java/memory/src/main/java/org/apache/arrow/memory/BaseAllocator.java
new file mode 100644
index 0000000..72f77ab
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/BaseAllocator.java
@@ -0,0 +1,781 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import io.netty.buffer.ArrowBuf;
+import io.netty.buffer.ByteBufAllocator;
+import io.netty.buffer.UnsafeDirectLittleEndian;
+
+import java.util.Arrays;
+import java.util.IdentityHashMap;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.arrow.memory.AllocationManager.BufferLedger;
+import org.apache.arrow.memory.util.AssertionUtil;
+import org.apache.arrow.memory.util.HistoricalLog;
+
+import com.google.common.base.Preconditions;
+
+public abstract class BaseAllocator extends Accountant implements BufferAllocator {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BaseAllocator.class);
+
+  public static final String DEBUG_ALLOCATOR = "arrow.memory.debug.allocator";
+
+  private static final AtomicLong ID_GENERATOR = new AtomicLong(0);
+  private static final int CHUNK_SIZE = AllocationManager.INNER_ALLOCATOR.getChunkSize();
+
+  public static final int DEBUG_LOG_LENGTH = 6;
+  public static final boolean DEBUG = AssertionUtil.isAssertionsEnabled()
+      || Boolean.parseBoolean(System.getProperty(DEBUG_ALLOCATOR, "false"));
+  private final Object DEBUG_LOCK = DEBUG ? new Object() : null;
+
+  private final BaseAllocator parentAllocator;
+  private final ByteBufAllocator thisAsByteBufAllocator;
+  private final IdentityHashMap<BaseAllocator, Object> childAllocators;
+  private final ArrowBuf empty;
+
+  private volatile boolean isClosed = false; // the allocator has been closed
+
+  // Package exposed for sharing between AllocationManager and BaseAllocator objects
+  final String name;
+  final RootAllocator root;
+
+  // members used purely for debugging
+  private final IdentityHashMap<BufferLedger, Object> childLedgers;
+  private final IdentityHashMap<Reservation, Object> reservations;
+  private final HistoricalLog historicalLog;
+
+  protected BaseAllocator(
+      final BaseAllocator parentAllocator,
+      final String name,
+      final long initReservation,
+      final long maxAllocation) throws OutOfMemoryException {
+    super(parentAllocator, initReservation, maxAllocation);
+
+    if (parentAllocator != null) {
+      this.root = parentAllocator.root;
+      empty = parentAllocator.empty;
+    } else if (this instanceof RootAllocator) {
+      this.root = (RootAllocator) this;
+      empty = createEmpty();
+    } else {
+      throw new IllegalStateException("A parent allocator must either carry a root or be the root.");
+    }
+
+    this.parentAllocator = parentAllocator;
+    this.name = name;
+
+    this.thisAsByteBufAllocator = new DrillByteBufAllocator(this);
+
+    if (DEBUG) {
+      childAllocators = new IdentityHashMap<>();
+      reservations = new IdentityHashMap<>();
+      childLedgers = new IdentityHashMap<>();
+      historicalLog = new HistoricalLog(DEBUG_LOG_LENGTH, "allocator[%s]", name);
+      hist("created by \"%s\", owned = %d", name, this.getAllocatedMemory());
+    } else {
+      childAllocators = null;
+      reservations = null;
+      historicalLog = null;
+      childLedgers = null;
+    }
+
+  }
+
+  public void assertOpen() {
+    if (AssertionUtil.ASSERT_ENABLED) {
+      if (isClosed) {
+        throw new IllegalStateException("Attempting operation on allocator when allocator is closed.\n"
+            + toVerboseString());
+      }
+    }
+  }
+
+  @Override
+  public String getName() {
+    return name;
+  }
+
+  @Override
+  public ArrowBuf getEmpty() {
+    assertOpen();
+    return empty;
+  }
+
+  /**
+   * For debug/verification purposes only. Allows an AllocationManager to tell the allocator that we have a new ledger
+   * associated with this allocator.
+   */
+  void associateLedger(BufferLedger ledger) {
+    assertOpen();
+    if (DEBUG) {
+      synchronized (DEBUG_LOCK) {
+        childLedgers.put(ledger, null);
+      }
+    }
+  }
+
+  /**
+   * For debug/verification purposes only. Allows an AllocationManager to tell the allocator that we are removing a
+   * ledger associated with this allocator
+   */
+  void dissociateLedger(BufferLedger ledger) {
+    assertOpen();
+    if (DEBUG) {
+      synchronized (DEBUG_LOCK) {
+        if (!childLedgers.containsKey(ledger)) {
+          throw new IllegalStateException("Trying to remove a child ledger that doesn't exist.");
+        }
+        childLedgers.remove(ledger);
+      }
+    }
+  }
+
+  /**
+   * Track when a ChildAllocator of this BaseAllocator is closed. Used for debugging purposes.
+   *
+   * @param childAllocator
+   *          The child allocator that has been closed.
+   */
+  private void childClosed(final BaseAllocator childAllocator) {
+    assertOpen();
+
+    if (DEBUG) {
+      Preconditions.checkArgument(childAllocator != null, "child allocator can't be null");
+
+      synchronized (DEBUG_LOCK) {
+        final Object object = childAllocators.remove(childAllocator);
+        if (object == null) {
+          childAllocator.historicalLog.logHistory(logger);
+          throw new IllegalStateException("Child allocator[" + childAllocator.name
+              + "] not found in parent allocator[" + name + "]'s childAllocators");
+        }
+      }
+    }
+  }
+
+  private static String createErrorMsg(final BufferAllocator allocator, final int rounded, final int requested) {
+    if (rounded != requested) {
+      return String.format(
+          "Unable to allocate buffer of size %d (rounded from %d) due to memory limit. Current allocation: %d",
+          rounded, requested, allocator.getAllocatedMemory());
+    } else {
+      return String.format("Unable to allocate buffer of size %d due to memory limit. Current allocation: %d",
+          rounded, allocator.getAllocatedMemory());
+    }
+  }
+
+  @Override
+  public ArrowBuf buffer(final int initialRequestSize) {
+    assertOpen();
+
+    return buffer(initialRequestSize, null);
+  }
+
+  private ArrowBuf createEmpty(){
+    assertOpen();
+
+    return new ArrowBuf(new AtomicInteger(), null, AllocationManager.INNER_ALLOCATOR.empty, null, null, 0, 0, true);
+  }
+
+  @Override
+  public ArrowBuf buffer(final int initialRequestSize, BufferManager manager) {
+    assertOpen();
+
+    Preconditions.checkArgument(initialRequestSize >= 0, "the requested size must be non-negative");
+
+    if (initialRequestSize == 0) {
+      return empty;
+    }
+
+    // round to next largest power of two if we're within a chunk since that is how our allocator operates
+    final int actualRequestSize = initialRequestSize < CHUNK_SIZE ?
+        nextPowerOfTwo(initialRequestSize)
+        : initialRequestSize;
+    AllocationOutcome outcome = this.allocateBytes(actualRequestSize);
+    if (!outcome.isOk()) {
+      throw new OutOfMemoryException(createErrorMsg(this, actualRequestSize, initialRequestSize));
+    }
+
+    boolean success = false;
+    try {
+      ArrowBuf buffer = bufferWithoutReservation(actualRequestSize, manager);
+      success = true;
+      return buffer;
+    } finally {
+      if (!success) {
+        releaseBytes(actualRequestSize);
+      }
+    }
+
+  }
+
+  /**
+   * Used by usual allocation as well as for allocating a pre-reserved buffer. Skips the typical accounting associated
+   * with creating a new buffer.
+   */
+  private ArrowBuf bufferWithoutReservation(final int size, BufferManager bufferManager) throws OutOfMemoryException {
+    assertOpen();
+
+    final AllocationManager manager = new AllocationManager(this, size);
+    final BufferLedger ledger = manager.associate(this); // +1 ref cnt (required)
+    final ArrowBuf buffer = ledger.newDrillBuf(0, size, bufferManager);
+
+    // make sure that our allocation is equal to what we expected.
+    Preconditions.checkArgument(buffer.capacity() == size,
+        "Allocated capacity %d was not equal to requested capacity %d.", buffer.capacity(), size);
+
+    return buffer;
+  }
+
+  @Override
+  public ByteBufAllocator getAsByteBufAllocator() {
+    return thisAsByteBufAllocator;
+  }
+
+  @Override
+  public BufferAllocator newChildAllocator(
+      final String name,
+      final long initReservation,
+      final long maxAllocation) {
+    assertOpen();
+
+    final ChildAllocator childAllocator = new ChildAllocator(this, name, initReservation, maxAllocation);
+
+    if (DEBUG) {
+      synchronized (DEBUG_LOCK) {
+        childAllocators.put(childAllocator, childAllocator);
+        historicalLog.recordEvent("allocator[%s] created new child allocator[%s]", name, childAllocator.name);
+      }
+    }
+
+    return childAllocator;
+  }
+
+  public class Reservation implements AllocationReservation {
+    private int nBytes = 0;
+    private boolean used = false;
+    private boolean closed = false;
+    private final HistoricalLog historicalLog;
+
+    public Reservation() {
+      if (DEBUG) {
+        historicalLog = new HistoricalLog("Reservation[allocator[%s], %d]", name, System.identityHashCode(this));
+        historicalLog.recordEvent("created");
+        synchronized (DEBUG_LOCK) {
+          reservations.put(this, this);
+        }
+      } else {
+        historicalLog = null;
+      }
+    }
+
+    public boolean add(final int nBytes) {
+      assertOpen();
+
+      Preconditions.checkArgument(nBytes >= 0, "nBytes(%d) < 0", nBytes);
+      Preconditions.checkState(!closed, "Attempt to increase reservation after reservation has been closed");
+      Preconditions.checkState(!used, "Attempt to increase reservation after reservation has been used");
+
+      // we round up to next power of two since all reservations are done in powers of
+      // two. This may overestimate the preallocation since someone may perceive
+      // additions to be power of two. If this becomes a problem, we can look at
+      // modifying this behavior so that we maintain what we reserve and what the user
+      // asked for and make sure to only round to power of two as necessary.
+      final int nBytesTwo = BaseAllocator.nextPowerOfTwo(nBytes);
+      if (!reserve(nBytesTwo)) {
+        return false;
+      }
+
+      this.nBytes += nBytesTwo;
+      return true;
+    }
+
+    public ArrowBuf allocateBuffer() {
+      assertOpen();
+
+      Preconditions.checkState(!closed, "Attempt to allocate after closed");
+      Preconditions.checkState(!used, "Attempt to allocate more than once");
+
+      final ArrowBuf drillBuf = allocate(nBytes);
+      used = true;
+      return drillBuf;
+    }
+
+    public int getSize() {
+      return nBytes;
+    }
+
+    public boolean isUsed() {
+      return used;
+    }
+
+    public boolean isClosed() {
+      return closed;
+    }
+
+    @Override
+    public void close() {
+      assertOpen();
+
+      if (closed) {
+        return;
+      }
+
+      if (DEBUG) {
+        if (!isClosed()) {
+          final Object object;
+          synchronized (DEBUG_LOCK) {
+            object = reservations.remove(this);
+          }
+          if (object == null) {
+            final StringBuilder sb = new StringBuilder();
+            print(sb, 0, Verbosity.LOG_WITH_STACKTRACE);
+            logger.debug(sb.toString());
+            throw new IllegalStateException(
+                String.format("Didn't find closing reservation[%d]", System.identityHashCode(this)));
+          }
+
+          historicalLog.recordEvent("closed");
+        }
+      }
+
+      if (!used) {
+        releaseReservation(nBytes);
+      }
+
+      closed = true;
+    }
+
+    public boolean reserve(int nBytes) {
+      assertOpen();
+
+      final AllocationOutcome outcome = BaseAllocator.this.allocateBytes(nBytes);
+
+      if (DEBUG) {
+        historicalLog.recordEvent("reserve(%d) => %s", nBytes, Boolean.toString(outcome.isOk()));
+      }
+
+      return outcome.isOk();
+    }
+
+    /**
+     * Allocate a buffer of the requested size.
+     *
+     * <p>
+     * The implementation of the allocator's inner class provides this.
+     *
+     * @param nBytes
+     *          the size of the buffer requested
+     * @return the buffer, or null, if the request cannot be satisfied
+     */
+    private ArrowBuf allocate(int nBytes) {
+      assertOpen();
+
+      boolean success = false;
+
+      /*
+       * The reservation already added the requested bytes to the allocator's owned and allocated bytes via reserve().
+       * This ensures that they can't go away. But when we ask for the buffer here, that will add to the allocated bytes
+       * as well, so we need to return the same number back to avoid double-counting them.
+       */
+      try {
+        final ArrowBuf drillBuf = BaseAllocator.this.bufferWithoutReservation(nBytes, null);
+
+        if (DEBUG) {
+          historicalLog.recordEvent("allocate() => %s", String.format("DrillBuf[%d]", drillBuf.getId()));
+        }
+        success = true;
+        return drillBuf;
+      } finally {
+        if (!success) {
+          releaseBytes(nBytes);
+        }
+      }
+    }
+
+    /**
+     * Return the reservation back to the allocator without having used it.
+     *
+     * @param nBytes
+     *          the size of the reservation
+     */
+    private void releaseReservation(int nBytes) {
+      assertOpen();
+
+      releaseBytes(nBytes);
+
+      if (DEBUG) {
+        historicalLog.recordEvent("releaseReservation(%d)", nBytes);
+      }
+    }
+
+  }
+
+  @Override
+  public AllocationReservation newReservation() {
+    assertOpen();
+
+    return new Reservation();
+  }
+
+
+  @Override
+  public synchronized void close() {
+    /*
+     * Some owners may close more than once because of complex cleanup and shutdown
+     * procedures.
+     */
+    if (isClosed) {
+      return;
+    }
+
+    isClosed = true;
+
+    if (DEBUG) {
+      synchronized(DEBUG_LOCK) {
+        verifyAllocator();
+
+        // are there outstanding child allocators?
+        if (!childAllocators.isEmpty()) {
+          for (final BaseAllocator childAllocator : childAllocators.keySet()) {
+            if (childAllocator.isClosed) {
+              logger.warn(String.format(
+                  "Closed child allocator[%s] on parent allocator[%s]'s child list.\n%s",
+                  childAllocator.name, name, toString()));
+            }
+          }
+
+          throw new IllegalStateException(
+              String.format("Allocator[%s] closed with outstanding child allocators.\n%s", name, toString()));
+        }
+
+        // are there outstanding buffers?
+        final int allocatedCount = childLedgers.size();
+        if (allocatedCount > 0) {
+          throw new IllegalStateException(
+              String.format("Allocator[%s] closed with outstanding buffers allocated (%d).\n%s",
+                  name, allocatedCount, toString()));
+        }
+
+        if (reservations.size() != 0) {
+          throw new IllegalStateException(
+              String.format("Allocator[%s] closed with outstanding reservations (%d).\n%s", name, reservations.size(),
+                  toString()));
+        }
+
+      }
+    }
+
+    // Is there unaccounted-for outstanding allocation?
+    final long allocated = getAllocatedMemory();
+    if (allocated > 0) {
+      throw new IllegalStateException(
+          String.format("Memory was leaked by query. Memory leaked: (%d)\n%s", allocated, toString()));
+    }
+
+    // we need to release our memory to our parent before we tell it we've closed.
+    super.close();
+
+    // Inform our parent allocator that we've closed
+    if (parentAllocator != null) {
+      parentAllocator.childClosed(this);
+    }
+
+    if (DEBUG) {
+      historicalLog.recordEvent("closed");
+      logger.debug(String.format(
+          "closed allocator[%s].",
+          name));
+    }
+
+
+  }
+
+  public String toString() {
+    final Verbosity verbosity = logger.isTraceEnabled() ? Verbosity.LOG_WITH_STACKTRACE
+        : Verbosity.BASIC;
+    final StringBuilder sb = new StringBuilder();
+    print(sb, 0, verbosity);
+    return sb.toString();
+  }
+
+  /**
+   * Provide a verbose string of the current allocator state. Includes the state of all child allocators, along with
+   * historical logs of each object and including stacktraces.
+   *
+   * @return A Verbose string of current allocator state.
+   */
+  public String toVerboseString() {
+    final StringBuilder sb = new StringBuilder();
+    print(sb, 0, Verbosity.LOG_WITH_STACKTRACE);
+    return sb.toString();
+  }
+
+  private void hist(String noteFormat, Object... args) {
+    historicalLog.recordEvent(noteFormat, args);
+  }
+
+  /**
+   * Rounds up the provided value to the nearest power of two.
+   *
+   * @param val
+   *          An integer value.
+   * @return The smallest power of two that is greater than or equal to that value.
+   */
+  static int nextPowerOfTwo(int val) {
+    int highestBit = Integer.highestOneBit(val);
+    if (highestBit == val) {
+      return val;
+    } else {
+      return highestBit << 1;
+    }
+  }
+
+
+  /**
+   * Verifies the accounting state of the allocator. Only works for DEBUG.
+   *
+   * @throws IllegalStateException
+   *           when any problems are found
+   */
+  void verifyAllocator() {
+    final IdentityHashMap<UnsafeDirectLittleEndian, BaseAllocator> buffersSeen = new IdentityHashMap<>();
+    verifyAllocator(buffersSeen);
+  }
+
+  /**
+   * Verifies the accounting state of the allocator. Only works for DEBUG.
+   *
+   * <p>
+   * This overload is used for recursive calls, allowing for checking that DrillBufs are unique across all allocators
+   * that are checked.
+   * </p>
+   *
+   * @param buffersSeen
+   *          a map of buffers that have already been seen when walking a tree of allocators
+   * @throws IllegalStateException
+   *           when any problems are found
+   */
+  private void verifyAllocator(final IdentityHashMap<UnsafeDirectLittleEndian, BaseAllocator> buffersSeen) {
+    synchronized (DEBUG_LOCK) {
+
+      // The remaining tests can only be performed if we're in debug mode.
+      if (!DEBUG) {
+        return;
+      }
+
+      final long allocated = getAllocatedMemory();
+
+      // verify my direct descendants
+      final Set<BaseAllocator> childSet = childAllocators.keySet();
+      for (final BaseAllocator childAllocator : childSet) {
+        childAllocator.verifyAllocator(buffersSeen);
+      }
+
+      /*
+       * Verify my relationships with my descendants.
+       *
+       * The sum of direct child allocators' owned memory must be <= my allocated memory; my allocated memory also
+       * includes DrillBufs directly allocated by me.
+       */
+      long childTotal = 0;
+      for (final BaseAllocator childAllocator : childSet) {
+        childTotal += Math.max(childAllocator.getAllocatedMemory(), childAllocator.reservation);
+      }
+      if (childTotal > getAllocatedMemory()) {
+        historicalLog.logHistory(logger);
+        logger.debug("allocator[" + name + "] child event logs BEGIN");
+        for (final BaseAllocator childAllocator : childSet) {
+          childAllocator.historicalLog.logHistory(logger);
+        }
+        logger.debug("allocator[" + name + "] child event logs END");
+        throw new IllegalStateException(
+            "Child allocators own more memory (" + childTotal + ") than their parent (name = "
+                + name + " ) has allocated (" + getAllocatedMemory() + ')');
+      }
+
+      // Furthermore, the amount I've allocated should be that plus buffers I've allocated.
+      long bufferTotal = 0;
+
+      final Set<BufferLedger> ledgerSet = childLedgers.keySet();
+      for (final BufferLedger ledger : ledgerSet) {
+        if (!ledger.isOwningLedger()) {
+          continue;
+        }
+
+        final UnsafeDirectLittleEndian udle = ledger.getUnderlying();
+        /*
+         * Even when shared, DrillBufs are rewrapped, so we should never see the same instance twice.
+         */
+        final BaseAllocator otherOwner = buffersSeen.get(udle);
+        if (otherOwner != null) {
+          throw new IllegalStateException("This allocator's drillBuf already owned by another allocator");
+        }
+        buffersSeen.put(udle, this);
+
+        bufferTotal += udle.capacity();
+      }
+
+      // Preallocated space has to be accounted for
+      final Set<Reservation> reservationSet = reservations.keySet();
+      long reservedTotal = 0;
+      for (final Reservation reservation : reservationSet) {
+        if (!reservation.isUsed()) {
+          reservedTotal += reservation.getSize();
+        }
+      }
+
+      if (bufferTotal + reservedTotal + childTotal != getAllocatedMemory()) {
+        final StringBuilder sb = new StringBuilder();
+        sb.append("allocator[");
+        sb.append(name);
+        sb.append("]\nallocated: ");
+        sb.append(Long.toString(allocated));
+        sb.append(" allocated - (bufferTotal + reservedTotal + childTotal): ");
+        sb.append(Long.toString(allocated - (bufferTotal + reservedTotal + childTotal)));
+        sb.append('\n');
+
+        if (bufferTotal != 0) {
+          sb.append("buffer total: ");
+          sb.append(Long.toString(bufferTotal));
+          sb.append('\n');
+          dumpBuffers(sb, ledgerSet);
+        }
+
+        if (childTotal != 0) {
+          sb.append("child total: ");
+          sb.append(Long.toString(childTotal));
+          sb.append('\n');
+
+          for (final BaseAllocator childAllocator : childSet) {
+            sb.append("child allocator[");
+            sb.append(childAllocator.name);
+            sb.append("] owned ");
+            sb.append(Long.toString(childAllocator.getAllocatedMemory()));
+            sb.append('\n');
+          }
+        }
+
+        if (reservedTotal != 0) {
+          sb.append(String.format("reserved total : %d bytes.", reservedTotal));
+          for (final Reservation reservation : reservationSet) {
+            reservation.historicalLog.buildHistory(sb, 0, true);
+            sb.append('\n');
+          }
+        }
+
+        logger.debug(sb.toString());
+
+        final long allocated2 = getAllocatedMemory();
+
+        if (allocated2 != allocated) {
+          throw new IllegalStateException(String.format(
+              "allocator[%s]: allocated t1 (%d) != allocated t2 (%d). Someone released memory while in verification.",
+              name, allocated, allocated2));
+
+        }
+        throw new IllegalStateException(String.format(
+            "allocator[%s]: buffer space (%d) + prealloc space (%d) + child space (%d) != allocated (%d)",
+            name, bufferTotal, reservedTotal, childTotal, allocated));
+      }
+    }
+  }
+
+  void print(StringBuilder sb, int level, Verbosity verbosity) {
+
+    indent(sb, level)
+        .append("Allocator(")
+        .append(name)
+        .append(") ")
+        .append(reservation)
+        .append('/')
+        .append(getAllocatedMemory())
+        .append('/')
+        .append(getPeakMemoryAllocation())
+        .append('/')
+        .append(getLimit())
+        .append(" (res/actual/peak/limit)")
+        .append('\n');
+
+    if (DEBUG) {
+      indent(sb, level + 1).append(String.format("child allocators: %d\n", childAllocators.size()));
+      for (BaseAllocator child : childAllocators.keySet()) {
+        child.print(sb, level + 2, verbosity);
+      }
+
+      indent(sb, level + 1).append(String.format("ledgers: %d\n", childLedgers.size()));
+      for (BufferLedger ledger : childLedgers.keySet()) {
+        ledger.print(sb, level + 2, verbosity);
+      }
+
+      final Set<Reservation> reservations = this.reservations.keySet();
+      indent(sb, level + 1).append(String.format("reservations: %d\n", reservations.size()));
+      for (final Reservation reservation : reservations) {
+        if (verbosity.includeHistoricalLog) {
+          reservation.historicalLog.buildHistory(sb, level + 3, true);
+        }
+      }
+
+    }
+
+  }
+
+  private void dumpBuffers(final StringBuilder sb, final Set<BufferLedger> ledgerSet) {
+    for (final BufferLedger ledger : ledgerSet) {
+      if (!ledger.isOwningLedger()) {
+        continue;
+      }
+      final UnsafeDirectLittleEndian udle = ledger.getUnderlying();
+      sb.append("UnsafeDirectLittleEndian[identityHashCode == ");
+      sb.append(Integer.toString(System.identityHashCode(udle)));
+      sb.append("] size ");
+      sb.append(Integer.toString(udle.capacity()));
+      sb.append('\n');
+    }
+  }
+
+
+  public static StringBuilder indent(StringBuilder sb, int indent) {
+    final char[] indentation = new char[indent * 2];
+    Arrays.fill(indentation, ' ');
+    sb.append(indentation);
+    return sb;
+  }
+
+  public static enum Verbosity {
+    BASIC(false, false), // only include basic information
+    LOG(true, false), // include basic information and the historical log
+    LOG_WITH_STACKTRACE(true, true) // include basic information, the historical log, and stack traces
+    ;
+
+    public final boolean includeHistoricalLog;
+    public final boolean includeStackTraces;
+
+    Verbosity(boolean includeHistoricalLog, boolean includeStackTraces) {
+      this.includeHistoricalLog = includeHistoricalLog;
+      this.includeStackTraces = includeStackTraces;
+    }
+  }
+
+  public static boolean isDebug() {
+    return DEBUG;
+  }
+}
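
Note on nextPowerOfTwo() in the file above: the rounding it performs can be tried in isolation. What follows is a minimal standalone sketch, not part of the patch. The class name PowerOfTwoSketch is made up for illustration; the method body follows the logic shown in the diff.

```java
// Standalone sketch; PowerOfTwoSketch is a hypothetical name, not a class in the patch.
final class PowerOfTwoSketch {
  private PowerOfTwoSketch() {
  }

  // Same logic as BaseAllocator.nextPowerOfTwo(int) in the diff above:
  // keep the value if it is already a power of two, otherwise double its highest set bit.
  static int nextPowerOfTwo(int val) {
    int highestBit = Integer.highestOneBit(val);
    return (highestBit == val) ? val : highestBit << 1;
  }

  public static void main(String[] args) {
    System.out.println(nextPowerOfTwo(1));    // 1
    System.out.println(nextPowerOfTwo(5));    // 8
    System.out.println(nextPowerOfTwo(4096)); // 4096
    System.out.println(nextPowerOfTwo(4097)); // 8192
    System.out.println(nextPowerOfTwo(0));    // 0
  }
}
```

With this logic an input of 0 comes back as 0, and any input above 2^30 overflows int when the highest bit is shifted left, so callers would presumably need to guard those cases themselves.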

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/BoundsChecking.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/BoundsChecking.java b/java/memory/src/main/java/org/apache/arrow/memory/BoundsChecking.java
new file mode 100644
index 0000000..4e88c73
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/BoundsChecking.java
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+public class BoundsChecking {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BoundsChecking.class);
+
+  public static final boolean BOUNDS_CHECKING_ENABLED;
+
+  static {
+    boolean isAssertEnabled = false;
+    assert isAssertEnabled = true;
+    BOUNDS_CHECKING_ENABLED = isAssertEnabled
+        || !"true".equals(System.getProperty("drill.enable_unsafe_memory_access"));
+  }
+
+  private BoundsChecking() {
+  }
+
+}
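
Note on the static initializer in BoundsChecking above: it relies on the side effect of an assert statement to detect whether the JVM runs with assertions enabled. The sketch below is a standalone illustration, not part of the patch. The class and method names are hypothetical, and the system property is passed in as a plain argument so the decision logic can be tried without setting JVM flags.

```java
// Standalone sketch; AssertionProbeSketch and its method names are hypothetical.
final class AssertionProbeSketch {
  private AssertionProbeSketch() {
  }

  // The assignment inside the assert only executes when the JVM runs with -ea,
  // so the returned flag reports whether assertions are enabled.
  static boolean assertionsEnabled() {
    boolean enabled = false;
    assert enabled = true;
    return enabled;
  }

  // Mirrors the expression in the diff: checking stays on unless assertions are off
  // AND the property value is exactly the string "true".
  static boolean boundsCheckingEnabled(boolean assertsOn, String unsafeProperty) {
    return assertsOn || !"true".equals(unsafeProperty);
  }

  public static void main(String[] args) {
    String property = System.getProperty("drill.enable_unsafe_memory_access");
    System.out.println("assertions enabled: " + assertionsEnabled());
    System.out.println("bounds checking:    " + boundsCheckingEnabled(assertionsEnabled(), property));
  }
}
```

Read this way, bounds checking is the default: an unset property, or any value other than the exact string "true", leaves it enabled, and running with -ea enables it regardless of the property.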

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/BufferAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/BufferAllocator.java b/java/memory/src/main/java/org/apache/arrow/memory/BufferAllocator.java
new file mode 100644
index 0000000..16a6812
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/BufferAllocator.java
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import io.netty.buffer.ByteBufAllocator;
+import io.netty.buffer.ArrowBuf;
+
+/**
+ * Wrapper class to deal with byte buffer allocation. Ensures users only use designated methods.
+ */
+public interface BufferAllocator extends AutoCloseable {
+  /**
+   * Allocate a new or reused buffer of the provided size. Note that the buffer may technically be larger than the
+   * requested size for rounding purposes. However, the buffer's capacity will be set to the configured size.
+   *
+   * @param size
+   *          The size in bytes.
+   * @return a new ArrowBuf, or null if the request can't be satisfied
+   * @throws OutOfMemoryException
+   *           if buffer cannot be allocated
+   */
+  public ArrowBuf buffer(int size);
+
+  /**
+   * Allocate a new or reused buffer of the provided size. Note that the buffer may technically be larger than the
+   * requested size for rounding purposes. However, the buffer's capacity will be set to the configured size.
+   *
+   * @param size
+   *          The size in bytes.
+   * @param manager
+   *          A buffer manager to manage reallocation.
+   * @return a new ArrowBuf, or null if the request can't be satisfied
+   * @throws OutOfMemoryException
+   *           if buffer cannot be allocated
+   */
+  public ArrowBuf buffer(int size, BufferManager manager);
+
+  /**
+   * Returns the allocator this allocator falls back to when it needs more memory.
+   *
+   * @return the underlying allocator used by this allocator
+   */
+  public ByteBufAllocator getAsByteBufAllocator();
+
+  /**
+   * Create a new child allocator.
+   *
+   * @param name
+   *          the name of the allocator.
+   * @param initReservation
+   *          the initial space reservation (obtained from this allocator)
+   * @param maxAllocation
+   *          maximum amount of space the new allocator can allocate
+   * @return the new allocator, or null if it can't be created
+   */
+  public BufferAllocator newChildAllocator(String name, long initReservation, long maxAllocation);
+
+  /**
+   * Close and release all buffers generated from this buffer pool.
+   *
+   * <p>When assertions are on, complains if there are any outstanding buffers; to avoid
+   * that, release all buffers before the allocator is closed.
+   */
+  @Override
+  public void close();
+
+  /**
+   * Returns the amount of memory currently allocated from this allocator.
+   *
+   * @return the amount of memory currently allocated
+   */
+  public long getAllocatedMemory();
+
+  /**
+   * Set the maximum amount of memory this allocator is allowed to allocate.
+   *
+   * @param newLimit
+   *          The new Limit to apply to allocations
+   */
+  public void setLimit(long newLimit);
+
+  /**
+   * Return the current maximum limit this allocator imposes.
+   *
+   * @return Limit in number of bytes.
+   */
+  public long getLimit();
+
+  /**
+   * Returns the peak amount of memory allocated from this allocator.
+   *
+   * @return the peak amount of memory allocated
+   */
+  public long getPeakMemoryAllocation();
+
+  /**
+   * Create an allocation reservation. A reservation is a way of building up
+   * a request for a buffer whose size is not known in advance. See
+   * {@link AllocationReservation}.
+   *
+   * @return the newly created reservation
+   */
+  public AllocationReservation newReservation();
+
+  /**
+   * Get a reference to the empty buffer associated with this allocator. Empty buffers are special because we don't
+   * worry about them leaking or managing reference counts on them since they don't actually point to any memory.
+   */
+  public ArrowBuf getEmpty();
+
+  /**
+   * Return the name of this allocator. This is a human readable name that can help debugging. Typically provides
+   * coordinates about where this allocator was created.
+   */
+  public String getName();
+
+  /**
+   * Return whether or not this allocator (or one of its parents) is over its limits. In the case that an allocator is
+   * over its limit, all consumers of that allocator should aggressively try to address the overlimit situation.
+   */
+  public boolean isOverLimit();
+
+  /**
+   * Return a verbose string describing this allocator. If in DEBUG mode, this will also include relevant stacktraces
+   * and historical logs for underlying objects.
+   *
+   * @return A very verbose description of the allocator hierarchy.
+   */
+  public String toVerboseString();
+
+  /**
+   * Asserts (using java assertions) that the provided allocator is currently open. If assertions are disabled, this is
+   * a no-op.
+   */
+  public void assertOpen();
+}
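
Note on newReservation() in the interface above: a reservation accumulates a request in steps and is then either turned into a single buffer or closed unused. The sketch below is a simplified standalone model of that lifecycle, not the patch's implementation. ToyAllocator, ToyReservation and roundUp are hypothetical names, the "buffer" is reduced to its size in bytes, and the state checks in add() are an assumption. The per-step rounding follows the tail of Reservation.add() shown in the BaseAllocator diff.

```java
// Standalone toy model; all names here are hypothetical and not taken from the patch.
final class ReservationSketch {
  private ReservationSketch() {
  }

  // Stand-in for the allocator's accounting: a byte counter capped by a limit.
  static final class ToyAllocator {
    final long limit;
    long allocated;

    ToyAllocator(long limit) {
      this.limit = limit;
    }

    boolean reserve(long nBytes) {
      if (allocated + nBytes > limit) {
        return false;
      }
      allocated += nBytes;
      return true;
    }

    void release(long nBytes) {
      allocated -= nBytes;
    }
  }

  static final class ToyReservation implements AutoCloseable {
    private final ToyAllocator owner;
    private int nBytes;
    private boolean used;
    private boolean closed;

    ToyReservation(ToyAllocator owner) {
      this.owner = owner;
    }

    // Each increment is rounded up to a power of two before it is reserved.
    boolean add(int requested) {
      if (closed || used) {
        throw new IllegalStateException("reservation is closed or already used");
      }
      int rounded = roundUp(requested);
      if (!owner.reserve(rounded)) {
        return false;
      }
      nBytes += rounded;
      return true;
    }

    // May be called once; the real method returns an ArrowBuf, this model returns the size.
    int allocateBuffer() {
      if (closed || used) {
        throw new IllegalStateException("reservation is closed or already used");
      }
      used = true;
      return nBytes;
    }

    // Closing an unused reservation hands the reserved bytes back to the allocator.
    @Override
    public void close() {
      if (closed) {
        return;
      }
      if (!used) {
        owner.release(nBytes);
      }
      closed = true;
    }
  }

  static int roundUp(int val) {
    int highestBit = Integer.highestOneBit(val);
    return (highestBit == val) ? val : highestBit << 1;
  }

  public static void main(String[] args) {
    ToyAllocator allocator = new ToyAllocator(1024);
    try (ToyReservation reservation = new ToyReservation(allocator)) {
      System.out.println(reservation.add(100)); // true, reserves 128
      System.out.println(reservation.add(300)); // true, reserves 512
      System.out.println(reservation.add(500)); // false, 640 + 512 exceeds 1024
      System.out.println(allocator.allocated);  // 640
    }
    System.out.println(allocator.allocated);    // 0, the unused reservation was released
  }
}
```

Because every increment is rounded separately, the reserved total (640 here) can noticeably exceed the sum of the requested sizes (400), which is the behavior the comment at the top of add() in the patch proposes to revisit.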

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/BufferManager.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/BufferManager.java b/java/memory/src/main/java/org/apache/arrow/memory/BufferManager.java
new file mode 100644
index 0000000..0610ff0
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/BufferManager.java
@@ -0,0 +1,66 @@
+/*******************************************************************************
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.memory;
+
+import io.netty.buffer.ArrowBuf;
+
+/**
+ * Manages a list of {@link ArrowBuf}s that can be reallocated as needed. Upon
+ * re-allocation the old buffer will be freed. Managing a list of these buffers
+ * prevents some parts of the system from needing to define a correct location
+ * to place the final call to free them.
+ *
+ * The current uses of these types of buffers are within the pluggable components of Drill.
+ * In UDFs, memory management should not be a concern. We provide access to re-allocatable
+ * DrillBufs to give UDF writers general purpose buffers we can account for. To prevent the need
+ * for UDFs to contain boilerplate to close all of the buffers they request, this list
+ * is tracked at a higher level and all of the buffers are freed once we are sure that
+ * the code depending on them is done executing (currently {@link FragmentContext}
+ * and {@link QueryContext}).
+ */
+public interface BufferManager extends AutoCloseable {
+
+  /**
+   * Replace an old buffer with a new one of at least the provided size. Does not copy data.
+   *
+   * @param old
+   *          Old Buffer that the user is no longer going to use.
+   * @param newSize
+   *          Size of new replacement buffer.
+   * @return the replacement buffer
+   */
+  public ArrowBuf replace(ArrowBuf old, int newSize);
+
+  /**
+   * Get a managed buffer of indeterminate size.
+   *
+   * @return A buffer.
+   */
+  public ArrowBuf getManagedBuffer();
+
+  /**
+   * Get a managed buffer of at least a certain size.
+   *
+   * @param size
+   *          The desired size
+   * @return A buffer
+   */
+  public ArrowBuf getManagedBuffer(int size);
+
+  public void close();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/ChildAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/ChildAllocator.java b/java/memory/src/main/java/org/apache/arrow/memory/ChildAllocator.java
new file mode 100644
index 0000000..6f120e5
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/ChildAllocator.java
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+
+/**
+ * Child allocator class. Only slightly different from the {@link RootAllocator},
+ * in that these can't be created directly, but must be obtained from
+ * {@link BufferAllocator#newChildAllocator(String, long, long)}.
+ *
+ * <p>Child allocators can only be created by the root, or other children, so
+ * this class is package private.</p>
+ */
+class ChildAllocator extends BaseAllocator {
+  /**
+   * Constructor.
+   *
+   * @param parentAllocator parent allocator -- the one creating this child
+   * @param name the name of the allocator
+   * @param initReservation initial amount of space to reserve (obtained from the parent)
+   * @param maxAllocation maximum amount of space that can be obtained from this allocator;
+   *   note this includes direct allocations (via {@link BufferAllocator#buffer(int)}
+   *   et al) and requests from descendant allocators
+   */
+  ChildAllocator(
+      BaseAllocator parentAllocator,
+      String name,
+      long initReservation,
+      long maxAllocation) {
+    super(parentAllocator, name, initReservation, maxAllocation);
+  }
+
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/DrillByteBufAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/DrillByteBufAllocator.java b/java/memory/src/main/java/org/apache/arrow/memory/DrillByteBufAllocator.java
new file mode 100644
index 0000000..23d6448
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/DrillByteBufAllocator.java
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.ByteBufAllocator;
+import io.netty.buffer.CompositeByteBuf;
+import io.netty.buffer.ExpandableByteBuf;
+
+/**
+ * An implementation of ByteBufAllocator that wraps a Drill BufferAllocator. This allows the RPC layer to be accounted
+ * and managed using Drill's BufferAllocator infrastructure. The only thing different from a typical BufferAllocator is
+ * the signature and the fact that this Allocator returns ExpandableByteBufs which enable otherwise non-expandable
+ * DrillBufs to be expandable.
+ */
+public class DrillByteBufAllocator implements ByteBufAllocator {
+
+  private static final int DEFAULT_BUFFER_SIZE = 4096;
+  private static final int DEFAULT_MAX_COMPOSITE_COMPONENTS = 16;
+
+  private final BufferAllocator allocator;
+
+  public DrillByteBufAllocator(BufferAllocator allocator) {
+    this.allocator = allocator;
+  }
+
+  @Override
+  public ByteBuf buffer() {
+    return buffer(DEFAULT_BUFFER_SIZE);
+  }
+
+  @Override
+  public ByteBuf buffer(int initialCapacity) {
+    return new ExpandableByteBuf(allocator.buffer(initialCapacity), allocator);
+  }
+
+  @Override
+  public ByteBuf buffer(int initialCapacity, int maxCapacity) {
+    return buffer(initialCapacity);
+  }
+
+  @Override
+  public ByteBuf ioBuffer() {
+    return buffer();
+  }
+
+  @Override
+  public ByteBuf ioBuffer(int initialCapacity) {
+    return buffer(initialCapacity);
+  }
+
+  @Override
+  public ByteBuf ioBuffer(int initialCapacity, int maxCapacity) {
+    return buffer(initialCapacity);
+  }
+
+  @Override
+  public ByteBuf directBuffer() {
+    return buffer();
+  }
+
+  @Override
+  public ByteBuf directBuffer(int initialCapacity) {
+    return allocator.buffer(initialCapacity);
+  }
+
+  @Override
+  public ByteBuf directBuffer(int initialCapacity, int maxCapacity) {
+    return buffer(initialCapacity, maxCapacity);
+  }
+
+  @Override
+  public CompositeByteBuf compositeBuffer() {
+    return compositeBuffer(DEFAULT_MAX_COMPOSITE_COMPONENTS);
+  }
+
+  @Override
+  public CompositeByteBuf compositeBuffer(int maxNumComponents) {
+    return new CompositeByteBuf(this, true, maxNumComponents);
+  }
+
+  @Override
+  public CompositeByteBuf compositeDirectBuffer() {
+    return compositeBuffer();
+  }
+
+  @Override
+  public CompositeByteBuf compositeDirectBuffer(int maxNumComponents) {
+    return compositeBuffer(maxNumComponents);
+  }
+
+  @Override
+  public boolean isDirectBufferPooled() {
+    return false;
+  }
+
+  @Override
+  public ByteBuf heapBuffer() {
+    throw fail();
+  }
+
+  @Override
+  public ByteBuf heapBuffer(int initialCapacity) {
+    throw fail();
+  }
+
+  @Override
+  public ByteBuf heapBuffer(int initialCapacity, int maxCapacity) {
+    throw fail();
+  }
+
+  @Override
+  public CompositeByteBuf compositeHeapBuffer() {
+    throw fail();
+  }
+
+  @Override
+  public CompositeByteBuf compositeHeapBuffer(int maxNumComponents) {
+    throw fail();
+  }
+
+  private RuntimeException fail() {
+    return new UnsupportedOperationException("Allocator doesn't support heap-based memory.");
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/OutOfMemoryException.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/OutOfMemoryException.java b/java/memory/src/main/java/org/apache/arrow/memory/OutOfMemoryException.java
new file mode 100644
index 0000000..6ba0284
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/OutOfMemoryException.java
@@ -0,0 +1,50 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+
+public class OutOfMemoryException extends RuntimeException {
+  private static final long serialVersionUID = -6858052345185793382L;
+
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OutOfMemoryException.class);
+
+  public OutOfMemoryException() {
+    super();
+  }
+
+  public OutOfMemoryException(String message, Throwable cause, boolean enableSuppression, boolean writableStackTrace) {
+    super(message, cause, enableSuppression, writableStackTrace);
+  }
+
+  public OutOfMemoryException(String message, Throwable cause) {
+    super(message, cause);
+
+  }
+
+  public OutOfMemoryException(String message) {
+    super(message);
+
+  }
+
+  public OutOfMemoryException(Throwable cause) {
+    super(cause);
+
+  }
+
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/README.md
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/README.md b/java/memory/src/main/java/org/apache/arrow/memory/README.md
new file mode 100644
index 0000000..09e4257
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/README.md
@@ -0,0 +1,121 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+# Memory: Allocation, Accounting and Management
+ 
+The memory management package contains all the memory allocation related items that Arrow uses to manage memory.
+
+
+## Key Components
+Memory management can be broken into the following main components:
+
+- Memory chunk allocation and fragmentation management
+  - `PooledByteBufAllocatorL` - A LittleEndian clone of Netty's jemalloc implementation
+  - `UnsafeDirectLittleEndian` - A base level memory access interface
+  - `LargeBuffer` - A buffer backing implementation used when working with data larger than one Netty chunk (defaults to 16 MB)
+- Memory limits & Accounting
+  - `Accountant` - A nestable class of lock-free memory accountants.
+- Application-level memory allocation
+  - `BufferAllocator` - The public interface application users should be leveraging
+  - `BaseAllocator` - The base implementation of memory allocation; contains the core of the Arrow allocator implementation
+  - `RootAllocator` - The root allocator. Typically only one is created per JVM
+  - `ChildAllocator` - A child allocator that derives from the root allocator
+- Buffer ownership and transfer capabilities
+  - `AllocationManager` - Responsible for managing the relationship between multiple allocators and a single chunk of memory
+  - `BufferLedger` - Responsible for maintaining the relationship between an `AllocationManager`, a `BufferAllocator` and one or more individual `ArrowBuf`s
+- Memory access
+  - `ArrowBuf` - The facade for interacting directly with a chunk of memory.
+ 
+
+## Memory Management Overview
+Arrow's memory model is based on the following basic concepts:
+
+ - Memory can be allocated up to some limit. That limit could be a real limit (OS/JVM) or a locally imposed limit.
+ - Allocation operates in two phases: accounting then actual allocation. Allocation could fail at either point.
+ - Allocation failure should be recoverable. In all cases, the Allocator infrastructure should expose memory allocation failures (OS or internal limit-based) as `OutOfMemoryException`s.
+ - Any allocator can reserve memory when created. This memory shall be held such that this allocator will always be able to allocate that amount of memory.
+ - A particular application component should work to use a local allocator to understand local memory usage and better debug memory leaks.
+ - The same physical memory can be shared by multiple allocators and the allocator must provide an accounting paradigm for this purpose.
+
+## Allocator Trees
+
+Arrow provides a tree-based model for memory allocation. The RootAllocator is created first, and all other allocators are created as its descendants. The RootAllocator is the master bookkeeper for memory allocations. Each allocator first determines whether it has enough local memory to satisfy a particular request. If not, the allocator asks its parent for an additional memory allocation.
+
+## Reserving Memory
+
+Arrow provides two different ways to reserve memory:
+
+  - BufferAllocator accounting reservations: 
+      When a new allocator (other than the `RootAllocator`) is initialized, it can set aside memory that it will keep locally for its lifetime. This is memory that will never be released back to its parent allocator until the allocator is closed.
+  - `AllocationReservation` via BufferAllocator.newReservation(): Allows a short-term preallocation strategy so that a particular subsystem can ensure future memory is available to support a particular request.
+  
+## Memory Ownership, Reference Counts and Sharing
+Many BufferAllocators can reference the same piece of memory at the same time. The most common case is a Broadcast Join, where many downstream operators in the same process receive the same physical memory. Each of these operators operates within its own Allocator context, so multiple allocators all point at the same physical memory. It is the AllocationManager's responsibility to ensure that, in this situation, all memory is accurately accounted for from the Root's perspective and that the memory is correctly released once all BufferAllocators have stopped using it.
+
+For simplicity of accounting, we treat that memory as being used by one of the BufferAllocators associated with the memory. When that allocator releases its claim on that memory, the memory ownership is then moved to another BufferLedger belonging to the same AllocationManager. Note that because an ArrowBuf.release() is what actually causes memory ownership transfer to occur, we always proceed with ownership transfer (even if that violates an allocator limit). It is the responsibility of the application owning a particular allocator to frequently confirm whether the allocator is over its memory limit (BufferAllocator.isOverLimit()) and, if so, attempt to aggressively release memory to ameliorate the situation.
+
+All ArrowBufs (direct or sliced) related to a single BufferLedger/BufferAllocator combination share the same reference count and either all will be valid or all will be invalid.
+
+## Object Hierarchy
+
+There are two main ways to look at the object hierarchy for Arrow's memory management scheme. The first is a memory-based perspective, shown below:
+
+### Memory Perspective
+<pre>
++ AllocationManager
+|
+|-- UnsafeDirectLittleEndian (One per AllocationManager)
+|
+|-+ BufferLedger 1 ==> Allocator A (owning)
+| ` - ArrowBuf 1
+|-+ BufferLedger 2 ==> Allocator B (non-owning)
+| ` - ArrowBuf 2
+|-+ BufferLedger 3 ==> Allocator C (non-owning)
+  | - ArrowBuf 3
+  | - ArrowBuf 4
+  ` - ArrowBuf 5
+</pre>
+
+In this picture, a piece of memory is owned by an AllocationManager. An AllocationManager is responsible for that piece of memory no matter which allocator(s) it is working with. An AllocationManager has a relationship with a piece of raw memory (via its reference to UnsafeDirectLittleEndian) as well as references to each BufferAllocator it has a relationship with.
+
+### Allocator Perspective
+<pre>
++ RootAllocator
+|-+ ChildAllocator 1
+| | - ChildAllocator 1.1
+| ` ...
+|
+|-+ ChildAllocator 2
+|-+ ChildAllocator 3
+| |
+| |-+ BufferLedger 1 ==> AllocationManager 1 (owning) ==> UDLE
+| | `- ArrowBuf 1
+| `-+ BufferLedger 2 ==> AllocationManager 2 (non-owning)==> UDLE
+| 	`- ArrowBuf 2
+|
+|-+ BufferLedger 3 ==> AllocationManager 1 (non-owning)==> UDLE
+| ` - ArrowBuf 3
+|-+ BufferLedger 4 ==> AllocationManager 2 (owning) ==> UDLE
+  | - ArrowBuf 4
+  | - ArrowBuf 5
+  ` - ArrowBuf 6
+</pre>
+
+In this picture, a RootAllocator owns three ChildAllocators. The first ChildAllocator (ChildAllocator 1) owns a subsequent ChildAllocator. ChildAllocator 3 has two BufferLedger/AllocationManager references. Coincidentally, each of these AllocationManagers is also associated with the RootAllocator. In this case, one of these AllocationManagers is owned by ChildAllocator 3 (AllocationManager 1) while the other AllocationManager (AllocationManager 2) is owned/accounted for by the RootAllocator. Note that in this scenario, ArrowBuf 1 shares the same underlying memory as ArrowBuf 3. However, the subset of that memory (e.g. through slicing) might be different. Also note that ArrowBuf 2 and ArrowBuf 4, 5 and 6 share the same underlying memory, and that ArrowBuf 4, 5 and 6 all share the same reference count and fate.
+
+## Debugging Issues
+The Allocator object provides a useful set of tools to better understand the status of the allocator. If in `DEBUG` mode, the allocator and supporting classes will record additional debug tracking information to better track down memory leaks and issues. To enable DEBUG mode, either enable Java assertions with `-ea` or pass the following system property to the VM when starting: `-Darrow.memory.debug.allocator=true`. The BufferAllocator also provides a `BufferAllocator.toVerboseString()` method, which can be used in DEBUG mode to get extensive stacktrace information and events associated with various Allocator behaviors.
\ No newline at end of file
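
The README above describes allocation as an accounting phase followed by the actual allocation, with each allocator keeping a local reservation and asking its parent only for what exceeds it. A minimal sketch of that accounting phase follows. The class `SketchAccountant` and its methods are hypothetical names chosen for illustration and are not the `Accountant` class from this commit; the rollback on failure is also simplified compared with what a production lock-free implementation would need.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of tree-based accounting; not Arrow's Accountant. */
class SketchAccountant {
  private final SketchAccountant parent;  // null for the root of the tree
  private final long reservation;         // bytes taken from the parent at creation
  private final long limit;               // most bytes this node may account for
  private final AtomicLong allocated = new AtomicLong();

  SketchAccountant(SketchAccountant parent, long reservation, long limit) {
    this.parent = parent;
    this.reservation = reservation;
    this.limit = limit;
    // The reservation is charged to the parent up front and held for our lifetime.
    if (parent != null && reservation > 0 && !parent.tryReserve(reservation)) {
      throw new IllegalStateException("parent cannot satisfy reservation");
    }
  }

  /** Accounting phase: true if size bytes fit locally and in every ancestor. */
  boolean tryReserve(long size) {
    long before = allocated.getAndAdd(size);
    long after = before + size;
    if (after > limit) {
      allocated.addAndGet(-size);  // roll back: local limit exceeded
      return false;
    }
    // Only the portion beyond the local reservation is requested from the parent.
    long fromParent = Math.max(0, after - reservation) - Math.max(0, before - reservation);
    if (parent != null && fromParent > 0 && !parent.tryReserve(fromParent)) {
      allocated.addAndGet(-size);  // roll back: an ancestor refused
      return false;
    }
    return true;
  }

  /** Returns size bytes, handing back to the parent whatever exceeded the reservation. */
  void release(long size) {
    long after = allocated.addAndGet(-size);
    long before = after + size;
    long toParent = Math.max(0, before - reservation) - Math.max(0, after - reservation);
    if (parent != null && toParent > 0) {
      parent.release(toParent);
    }
  }

  long getAllocated() {
    return allocated.get();
  }
}
```

In this sketch a failed `tryReserve` would be the point at which a real allocator raises `OutOfMemoryException`; only after it succeeds would the physical buffer be requested.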

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/RootAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/RootAllocator.java b/java/memory/src/main/java/org/apache/arrow/memory/RootAllocator.java
new file mode 100644
index 0000000..571fc37
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/RootAllocator.java
@@ -0,0 +1,39 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import com.google.common.annotations.VisibleForTesting;
+
+/**
+ * The root allocator for using direct memory inside an Arrow application. Supports creating a
+ * tree of descendant child allocators.
+ */
+public class RootAllocator extends BaseAllocator {
+
+  public RootAllocator(final long limit) {
+    super(null, "ROOT", 0, limit);
+  }
+
+  /**
+   * Verify the accounting state of the allocation system.
+   */
+  @VisibleForTesting
+  public void verify() {
+    verifyAllocator();
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/package-info.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/package-info.java b/java/memory/src/main/java/org/apache/arrow/memory/package-info.java
new file mode 100644
index 0000000..712af30
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/package-info.java
@@ -0,0 +1,24 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+/**
+ *  Memory Allocation, Accounting and Management
+ *
+ *  See the README.md file in this directory for detailed information about Arrow's memory allocation subsystem.
+ *
+ */
+package org.apache.arrow.memory;

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/AssertionUtil.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/AssertionUtil.java b/java/memory/src/main/java/org/apache/arrow/memory/util/AssertionUtil.java
new file mode 100644
index 0000000..28d0785
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/AssertionUtil.java
@@ -0,0 +1,37 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+public class AssertionUtil {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AssertionUtil.class);
+
+  public static final boolean ASSERT_ENABLED;
+
+  static{
+    boolean isAssertEnabled = false;
+    assert isAssertEnabled = true;
+    ASSERT_ENABLED = isAssertEnabled;
+  }
+
+  public static boolean isAssertionsEnabled(){
+    return ASSERT_ENABLED;
+  }
+
+  private AssertionUtil() {
+  }
+}
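
The static initializer in `AssertionUtil` relies on an `assert` whose expression has a side effect: the assignment runs only when the JVM was started with `-ea`. The README says DEBUG mode is enabled either by assertions or by a system property, and a combined check might look like the sketch below. `DebugFlagSketch` and the property names passed to it are hypothetical; how the allocator in this commit actually derives its flag is not shown in this part of the diff.

```java
/** Hypothetical illustration of deriving a debug flag; not code from this commit. */
class DebugFlagSketch {

  /** True when assertions are enabled for this class (for example via -ea). */
  static boolean assertionsEnabled() {
    boolean enabled = false;
    // The assignment inside the assert executes only when assertions are on.
    assert enabled = true;
    return enabled;
  }

  /** Debug is on if assertions are enabled or the named system property is "true". */
  static boolean isDebug(String propertyName) {
    // Boolean.getBoolean reads a system property and parses it as a boolean.
    return assertionsEnabled() || Boolean.getBoolean(propertyName);
  }
}
```

Calling `DebugFlagSketch.isDebug("arrow.memory.debug.allocator")` would then honor the property named in the README as well as `-ea`.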

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/AutoCloseableLock.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/AutoCloseableLock.java b/java/memory/src/main/java/org/apache/arrow/memory/util/AutoCloseableLock.java
new file mode 100644
index 0000000..94e5cc5
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/AutoCloseableLock.java
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+import java.util.concurrent.locks.Lock;
+
+/**
+ * Simple wrapper class that allows Locks to be released via a try-with-resources block.
+ */
+public class AutoCloseableLock implements AutoCloseable {
+
+  private final Lock lock;
+
+  public AutoCloseableLock(Lock lock) {
+    this.lock = lock;
+  }
+
+  public AutoCloseableLock open() {
+    lock.lock();
+    return this;
+  }
+
+  @Override
+  public void close() {
+    lock.unlock();
+  }
+
+}
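
A short usage sketch for the wrapper above. The snippet repeats `AutoCloseableLock` (with package-private visibility) so that it compiles on its own; `GuardedCounter` is a hypothetical caller invented for illustration.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Copy of the wrapper from the diff, repeated so the example is self-contained. */
class AutoCloseableLock implements AutoCloseable {
  private final Lock lock;

  AutoCloseableLock(Lock lock) {
    this.lock = lock;
  }

  AutoCloseableLock open() {
    lock.lock();
    return this;
  }

  @Override
  public void close() {
    lock.unlock();
  }
}

/** Hypothetical caller: a counter whose updates are guarded by the wrapper. */
class GuardedCounter {
  private final ReentrantLock rawLock = new ReentrantLock();
  private final AutoCloseableLock guard = new AutoCloseableLock(rawLock);
  private long value;

  long increment() {
    // open() acquires the lock; close() releases it when the block exits,
    // whether normally or through an exception.
    try (AutoCloseableLock ignored = guard.open()) {
      return ++value;
    }
  }

  boolean isLocked() {
    return rawLock.isLocked();
  }
}
```

The benefit over a manual `lock()` / `finally { unlock(); }` pair is that the release cannot be forgotten on an early return or exception path.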


[06/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/UnionVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/UnionVector.java b/java/vector/src/main/codegen/templates/UnionVector.java
new file mode 100644
index 0000000..ba94ac2
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/UnionVector.java
@@ -0,0 +1,467 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/UnionVector.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex;
+
+<#include "/@includes/vv_imports.ftl" />
+import java.util.ArrayList;
+import java.util.Iterator;
+import org.apache.arrow.vector.complex.impl.ComplexCopier;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.BasicTypeHelper;
+
+/*
+ * This class is generated using freemarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+
+
+/**
+ * A vector which can hold values of different types. It does so by using a MapVector which contains a vector for each
+ * primitive type that is stored. MapVector is used in order to take advantage of its serialization/deserialization methods,
+ * as well as the addOrGet method.
+ *
+ * For performance reasons, UnionVector stores a cached reference to each subtype vector, to avoid having to do the map lookup
+ * each time the vector is accessed.
+ */
+public class UnionVector implements ValueVector {
+
+  private MaterializedField field;
+  private BufferAllocator allocator;
+  private Accessor accessor = new Accessor();
+  private Mutator mutator = new Mutator();
+  private int valueCount;
+
+  private MapVector internalMap;
+  private UInt1Vector typeVector;
+
+  private MapVector mapVector;
+  private ListVector listVector;
+
+  private FieldReader reader;
+  private NullableBitVector bit;
+
+  private int singleType = 0;
+  private ValueVector singleVector;
+  private MajorType majorType;
+
+  private final CallBack callBack;
+
+  public UnionVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    this.field = field.clone();
+    this.allocator = allocator;
+    this.internalMap = new MapVector("internal", allocator, callBack);
+    this.typeVector = internalMap.addOrGet("types", new MajorType(MinorType.UINT1, DataMode.REQUIRED), UInt1Vector.class);
+    this.field.addChild(internalMap.getField().clone());
+    this.majorType = field.getType();
+    this.callBack = callBack;
+  }
+
+  public BufferAllocator getAllocator() {
+    return allocator;
+  }
+
+  public List<MinorType> getSubTypes() {
+    return majorType.getSubTypes();
+  }
+
+  public void addSubType(MinorType type) {
+    if (majorType.getSubTypes().contains(type)) {
+      return;
+    }
+    List<MinorType> subTypes = this.majorType.getSubTypes();
+    List<MinorType> newSubTypes = new ArrayList<>(subTypes);
+    newSubTypes.add(type);
+    majorType =  new MajorType(this.majorType.getMinorType(), this.majorType.getMode(), this.majorType.getPrecision(),
+            this.majorType.getScale(), this.majorType.getTimezone(), newSubTypes);
+    field = MaterializedField.create(field.getName(), majorType);
+    if (callBack != null) {
+      callBack.doWork();
+    }
+  }
+
+  private static final MajorType MAP_TYPE = new MajorType(MinorType.MAP, DataMode.OPTIONAL);
+
+  public MapVector getMap() {
+    if (mapVector == null) {
+      int vectorCount = internalMap.size();
+      mapVector = internalMap.addOrGet("map", MAP_TYPE, MapVector.class);
+      addSubType(MinorType.MAP);
+      if (internalMap.size() > vectorCount) {
+        mapVector.allocateNew();
+      }
+    }
+    return mapVector;
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+  <#if !minor.class?starts_with("Decimal")>
+
+  private Nullable${name}Vector ${uncappedName}Vector;
+  private static final MajorType ${name?upper_case}_TYPE = new MajorType(MinorType.${name?upper_case}, DataMode.OPTIONAL);
+
+  public Nullable${name}Vector get${name}Vector() {
+    if (${uncappedName}Vector == null) {
+      int vectorCount = internalMap.size();
+      ${uncappedName}Vector = internalMap.addOrGet("${uncappedName}", ${name?upper_case}_TYPE, Nullable${name}Vector.class);
+      addSubType(MinorType.${name?upper_case});
+      if (internalMap.size() > vectorCount) {
+        ${uncappedName}Vector.allocateNew();
+      }
+    }
+    return ${uncappedName}Vector;
+  }
+
+  </#if>
+
+  </#list></#list>
+
+  private static final MajorType LIST_TYPE = new MajorType(MinorType.LIST, DataMode.OPTIONAL);
+
+  public ListVector getList() {
+    if (listVector == null) {
+      int vectorCount = internalMap.size();
+      listVector = internalMap.addOrGet("list", LIST_TYPE, ListVector.class);
+      addSubType(MinorType.LIST);
+      if (internalMap.size() > vectorCount) {
+        listVector.allocateNew();
+      }
+    }
+    return listVector;
+  }
+
+  public int getTypeValue(int index) {
+    return typeVector.getAccessor().get(index);
+  }
+
+  public UInt1Vector getTypeVector() {
+    return typeVector;
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
+    internalMap.allocateNew();
+    if (typeVector != null) {
+      typeVector.zeroVector();
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    boolean safe = internalMap.allocateNewSafe();
+    if (safe) {
+      if (typeVector != null) {
+        typeVector.zeroVector();
+      }
+    }
+    return safe;
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return Math.min(typeVector.getValueCapacity(), internalMap.getValueCapacity());
+  }
+
+  @Override
+  public void close() {
+  }
+
+  @Override
+  public void clear() {
+    internalMap.clear();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return field;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new TransferImpl(field, allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new TransferImpl(field.withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector target) {
+    return new TransferImpl((UnionVector) target);
+  }
+
+  public void transferTo(UnionVector target) {
+    internalMap.makeTransferPair(target.internalMap).transfer();
+    target.valueCount = valueCount;
+    target.majorType = majorType;
+  }
+
+  public void copyFrom(int inIndex, int outIndex, UnionVector from) {
+    from.getReader().setPosition(inIndex);
+    getWriter().setPosition(outIndex);
+    ComplexCopier.copy(from.reader, mutator.writer);
+  }
+
+  public void copyFromSafe(int inIndex, int outIndex, UnionVector from) {
+    copyFrom(inIndex, outIndex, from);
+  }
+
+  public ValueVector addVector(ValueVector v) {
+    String name = v.getField().getType().getMinorType().name().toLowerCase();
+    MajorType type = v.getField().getType();
+    Preconditions.checkState(internalMap.getChild(name) == null, String.format("%s vector already exists", name));
+    final ValueVector newVector = internalMap.addOrGet(name, type, (Class<ValueVector>) BasicTypeHelper.getValueVectorClass(type.getMinorType(), type.getMode()));
+    v.makeTransferPair(newVector).transfer();
+    internalMap.putChild(name, newVector);
+    addSubType(v.getField().getType().getMinorType());
+    return newVector;
+  }
+
+  private class TransferImpl implements TransferPair {
+
+    UnionVector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator) {
+      to = new UnionVector(field, allocator, null);
+    }
+
+    public TransferImpl(UnionVector to) {
+      this.to = to;
+    }
+
+    @Override
+    public void transfer() {
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+
+    }
+
+    @Override
+    public ValueVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) {
+      this.to.copyFrom(from, to, UnionVector.this);
+    }
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  @Override
+  public FieldReader getReader() {
+    if (reader == null) {
+      reader = new UnionReader(this);
+    }
+    return reader;
+  }
+
+  public FieldWriter getWriter() {
+    if (mutator.writer == null) {
+      mutator.writer = new UnionWriter(this);
+    }
+    return mutator.writer;
+  }
+
+//  @Override
+//  public UserBitShared.SerializedField getMetadata() {
+//    SerializedField.Builder b = getField() //
+//            .getAsBuilder() //
+//            .setBufferLength(getBufferSize()) //
+//            .setValueCount(valueCount);
+//
+//    b.addChild(internalMap.getMetadata());
+//    return b.build();
+//  }
+
+  @Override
+  public int getBufferSize() {
+    return internalMap.getBufferSize();
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    long bufferSize = 0;
+    for (final ValueVector v : (Iterable<ValueVector>) this) {
+      bufferSize += v.getBufferSizeFor(valueCount);
+    }
+
+    return (int) bufferSize;
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    return internalMap.getBuffers(clear);
+  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    List<ValueVector> vectors = Lists.newArrayList(internalMap.iterator());
+    vectors.add(typeVector);
+    return vectors.iterator();
+  }
+
+  public class Accessor extends BaseValueVector.BaseAccessor {
+
+
+    @Override
+    public Object getObject(int index) {
+      int type = typeVector.getAccessor().get(index);
+      switch (MinorType.values()[type]) {
+      case LATE:
+        return null;
+      <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+      <#assign fields = minor.fields!type.fields />
+      <#assign uncappedName = name?uncap_first/>
+      <#if !minor.class?starts_with("Decimal")>
+      case ${name?upper_case}:
+        return get${name}Vector().getAccessor().getObject(index);
+      </#if>
+
+      </#list></#list>
+      case MAP:
+        return getMap().getAccessor().getObject(index);
+      case LIST:
+        return getList().getAccessor().getObject(index);
+      default:
+        throw new UnsupportedOperationException("Cannot support type: " + MinorType.values()[type]);
+      }
+    }
+
+    public byte[] get(int index) {
+      return null;
+    }
+
+    public void get(int index, ComplexHolder holder) {
+    }
+
+    public void get(int index, UnionHolder holder) {
+      FieldReader reader = new UnionReader(UnionVector.this);
+      reader.setPosition(index);
+      holder.reader = reader;
+    }
+
+    @Override
+    public int getValueCount() {
+      return valueCount;
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return typeVector.getAccessor().get(index) == 0;
+    }
+
+    public int isSet(int index) {
+      return isNull(index) ? 0 : 1;
+    }
+  }
+
+  public class Mutator extends BaseValueVector.BaseMutator {
+
+    UnionWriter writer;
+
+    @Override
+    public void setValueCount(int valueCount) {
+      UnionVector.this.valueCount = valueCount;
+      internalMap.getMutator().setValueCount(valueCount);
+    }
+
+    public void setSafe(int index, UnionHolder holder) {
+      FieldReader reader = holder.reader;
+      if (writer == null) {
+        writer = new UnionWriter(UnionVector.this);
+      }
+      writer.setPosition(index);
+      MinorType type = reader.getType().getMinorType();
+      switch (type) {
+      <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+      <#assign fields = minor.fields!type.fields />
+      <#assign uncappedName = name?uncap_first/>
+      <#if !minor.class?starts_with("Decimal")>
+      case ${name?upper_case}:
+        Nullable${name}Holder ${uncappedName}Holder = new Nullable${name}Holder();
+        reader.read(${uncappedName}Holder);
+        setSafe(index, ${uncappedName}Holder);
+        break;
+      </#if>
+      </#list></#list>
+      case MAP: {
+        ComplexCopier.copy(reader, writer);
+        break;
+      }
+      case LIST: {
+        ComplexCopier.copy(reader, writer);
+        break;
+      }
+      default:
+        throw new UnsupportedOperationException();
+      }
+    }
+
+    <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+    <#assign fields = minor.fields!type.fields />
+    <#assign uncappedName = name?uncap_first/>
+    <#if !minor.class?starts_with("Decimal")>
+    public void setSafe(int index, Nullable${name}Holder holder) {
+      setType(index, MinorType.${name?upper_case});
+      get${name}Vector().getMutator().setSafe(index, holder);
+    }
+
+    </#if>
+    </#list></#list>
+
+    public void setType(int index, MinorType type) {
+      typeVector.getMutator().setSafe(index, type.ordinal());
+    }
+
+    @Override
+    public void reset() { }
+
+    @Override
+    public void generateTestData(int values) { }
+  }
+}
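
The core idea of the UnionVector template above can be sketched outside FreeMarker: a per-slot type tag records which child vector holds the value, and a tag of 0 is treated as null, as isNull does above. The following is a simplified illustration only, not the generated class. TinyUnion, Tag, and all field names are hypothetical, plain Java arrays stand in for ArrowBuf-backed vectors, and the sketch assumes the tag enum reserves ordinal 0 for the unset case.

```java
// Hypothetical, simplified stand-in for a union vector: plain arrays, no ArrowBuf.
class TinyUnion {
  // Assumption: ordinal 0 is reserved for "unset", mirroring the isNull check above.
  enum Tag { NONE, INT, TEXT }

  private final int[] tags;      // one type tag per slot (stores Tag.ordinal())
  private final int[] ints;      // child storage for INT slots
  private final String[] texts;  // child storage for TEXT slots

  TinyUnion(int capacity) {
    tags = new int[capacity];    // zero-filled, so every slot starts out null
    ints = new int[capacity];
    texts = new String[capacity];
  }

  void setInt(int index, int value) {
    tags[index] = Tag.INT.ordinal();   // record the type, then store in the matching child
    ints[index] = value;
  }

  void setText(int index, String value) {
    tags[index] = Tag.TEXT.ordinal();
    texts[index] = value;
  }

  boolean isNull(int index) {
    return tags[index] == 0;
  }

  // Dispatch on the tag to find the child that holds the value.
  Object getObject(int index) {
    switch (Tag.values()[tags[index]]) {
      case INT:  return ints[index];
      case TEXT: return texts[index];
      default:   return null;
    }
  }
}
```

Every child array here is as long as the union itself, so slot i of the tag array lines up with slot i of each child. The setSafe methods in the template follow the same pattern by writing to each child vector at the same index as the type vector.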

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/UnionWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/UnionWriter.java b/java/vector/src/main/codegen/templates/UnionWriter.java
new file mode 100644
index 0000000..c9c29e0
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/UnionWriter.java
@@ -0,0 +1,228 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/UnionWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using freemarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+public class UnionWriter extends AbstractFieldWriter implements FieldWriter {
+
+  UnionVector data;
+  private MapWriter mapWriter;
+  private UnionListWriter listWriter;
+  private List<BaseWriter> writers = Lists.newArrayList();
+
+  public UnionWriter(BufferAllocator allocator) {
+    super(null);
+  }
+
+  public UnionWriter(UnionVector vector) {
+    super(null);
+    data = vector;
+  }
+
+  public UnionWriter(UnionVector vector, FieldWriter parent) {
+    super(null);
+    data = vector;
+  }
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    for (BaseWriter writer : writers) {
+      writer.setPosition(index);
+    }
+  }
+
+
+  @Override
+  public void start() {
+    data.getMutator().setType(idx(), MinorType.MAP);
+    getMapWriter().start();
+  }
+
+  @Override
+  public void end() {
+    getMapWriter().end();
+  }
+
+  @Override
+  public void startList() {
+    getListWriter().startList();
+    data.getMutator().setType(idx(), MinorType.LIST);
+  }
+
+  @Override
+  public void endList() {
+    getListWriter().endList();
+  }
+
+  private MapWriter getMapWriter() {
+    if (mapWriter == null) {
+      mapWriter = new SingleMapWriter(data.getMap(), null, true);
+      mapWriter.setPosition(idx());
+      writers.add(mapWriter);
+    }
+    return mapWriter;
+  }
+
+  public MapWriter asMap() {
+    data.getMutator().setType(idx(), MinorType.MAP);
+    return getMapWriter();
+  }
+
+  private ListWriter getListWriter() {
+    if (listWriter == null) {
+      listWriter = new UnionListWriter(data.getList());
+      listWriter.setPosition(idx());
+      writers.add(listWriter);
+    }
+    return listWriter;
+  }
+
+  public ListWriter asList() {
+    data.getMutator().setType(idx(), MinorType.LIST);
+    return getListWriter();
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+
+          <#if !minor.class?starts_with("Decimal")>
+
+  private ${name}Writer ${name?uncap_first}Writer;
+
+  private ${name}Writer get${name}Writer() {
+    if (${uncappedName}Writer == null) {
+      ${uncappedName}Writer = new Nullable${name}WriterImpl(data.get${name}Vector(), null);
+      ${uncappedName}Writer.setPosition(idx());
+      writers.add(${uncappedName}Writer);
+    }
+    return ${uncappedName}Writer;
+  }
+
+  public ${name}Writer as${name}() {
+    data.getMutator().setType(idx(), MinorType.${name?upper_case});
+    return get${name}Writer();
+  }
+
+  @Override
+  public void write(${name}Holder holder) {
+    data.getMutator().setType(idx(), MinorType.${name?upper_case});
+    get${name}Writer().setPosition(idx());
+    get${name}Writer().write${name}(<#list fields as field>holder.${field.name}<#if field_has_next>, </#if></#list>);
+  }
+
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    data.getMutator().setType(idx(), MinorType.${name?upper_case});
+    get${name}Writer().setPosition(idx());
+    get${name}Writer().write${name}(<#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+  }
+  </#if>
+
+  </#list></#list>
+
+  public void writeNull() {
+  }
+
+  @Override
+  public MapWriter map() {
+    data.getMutator().setType(idx(), MinorType.LIST);
+    getListWriter().setPosition(idx());
+    return getListWriter().map();
+  }
+
+  @Override
+  public ListWriter list() {
+    data.getMutator().setType(idx(), MinorType.LIST);
+    getListWriter().setPosition(idx());
+    return getListWriter().list();
+  }
+
+  @Override
+  public ListWriter list(String name) {
+    data.getMutator().setType(idx(), MinorType.MAP);
+    getMapWriter().setPosition(idx());
+    return getMapWriter().list(name);
+  }
+
+  @Override
+  public MapWriter map(String name) {
+    data.getMutator().setType(idx(), MinorType.MAP);
+    getMapWriter().setPosition(idx());
+    return getMapWriter().map(name);
+  }
+
+  <#list vv.types as type><#list type.minor as minor>
+  <#assign lowerName = minor.class?uncap_first />
+  <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+  <#assign upperName = minor.class?upper_case />
+  <#assign capName = minor.class?cap_first />
+  <#if !minor.class?starts_with("Decimal")>
+  @Override
+  public ${capName}Writer ${lowerName}(String name) {
+    data.getMutator().setType(idx(), MinorType.MAP);
+    getMapWriter().setPosition(idx());
+    return getMapWriter().${lowerName}(name);
+  }
+
+  @Override
+  public ${capName}Writer ${lowerName}() {
+    data.getMutator().setType(idx(), MinorType.LIST);
+    getListWriter().setPosition(idx());
+    return getListWriter().${lowerName}();
+  }
+  </#if>
+  </#list></#list>
+
+  @Override
+  public void allocate() {
+    data.allocateNew();
+  }
+
+  @Override
+  public void clear() {
+    data.clear();
+  }
+
+  @Override
+  public void close() throws Exception {
+    data.close();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return data.getField();
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return data.getValueCapacity();
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/ValueHolders.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/ValueHolders.java b/java/vector/src/main/codegen/templates/ValueHolders.java
new file mode 100644
index 0000000..2b14194
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/ValueHolders.java
@@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+<@pp.dropOutputFile />
+<#list vv.modes as mode>
+<#list vv.types as type>
+<#list type.minor as minor>
+
+<#assign className="${mode.prefix}${minor.class}Holder" />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/holders/${className}.java" />
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.holders;
+
+<#include "/@includes/vv_imports.ftl" />
+
+public final class ${className} implements ValueHolder{
+  
+  public static final MajorType TYPE = new MajorType(MinorType.${minor.class?upper_case}, DataMode.${mode.name?upper_case});
+
+  public MajorType getType() {return TYPE;}
+  
+    <#if mode.name == "Repeated">
+    
+    /** The first index (inclusive) into the Vector. **/
+    public int start;
+    
+    /** The last index (exclusive) into the Vector. **/
+    public int end;
+    
+    /** The Vector holding the actual values. **/
+    public ${minor.class}Vector vector;
+    
+    <#else>
+    public static final int WIDTH = ${type.width};
+    
+    <#if mode.name == "Optional">public int isSet;</#if>
+    <#assign fields = minor.fields!type.fields />
+    <#list fields as field>
+    public ${field.type} ${field.name};
+    </#list>
+    
+    <#if minor.class.startsWith("Decimal")>
+    public static final int maxPrecision = ${minor.maxPrecisionDigits};
+    <#if minor.class.startsWith("Decimal28") || minor.class.startsWith("Decimal38")>
+    public static final int nDecimalDigits = ${minor.nDecimalDigits};
+    
+    public static int getInteger(int index, int start, ArrowBuf buffer) {
+      int value = buffer.getInt(start + (index * 4));
+
+      if (index == 0) {
+          /* the first byte contains sign bit, return value without it */
+          <#if minor.class.endsWith("Sparse")>
+          value = (value & 0x7FFFFFFF);
+          <#elseif minor.class.endsWith("Dense")>
+          value = (value & 0x0000007F);
+          </#if>
+      }
+      return value;
+    }
+
+    public static void setInteger(int index, int value, int start, ArrowBuf buffer) {
+        buffer.setInt(start + (index * 4), value);
+    }
+  
+    public static void setSign(boolean sign, int start, ArrowBuf buffer) {
+      // Set MSB to 1 if sign is negative
+      if (sign) {
+        int value = getInteger(0, start, buffer);
+        setInteger(0, (value | 0x80000000), start, buffer);
+      }
+    }
+  
+    public static boolean getSign(int start, ArrowBuf buffer) {
+      return ((buffer.getInt(start) & 0x80000000) != 0);
+    }
+    </#if></#if>
+    
+    @Deprecated
+    public int hashCode(){
+      throw new UnsupportedOperationException();
+    }
+
+    /*
+     * Reason for deprecation is that ValueHolders are potential scalar replacements
+     * and hence we don't want any methods to be invoked on them.
+     */
+    @Deprecated
+    public String toString(){
+      throw new UnsupportedOperationException();
+    }
+    </#if>
+    
+    
+    
+    
+}
+
+</#list>
+</#list>
+</#list>
\ No newline at end of file
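
The sign handling in the Decimal28/Decimal38 holders above keeps the sign in the most significant bit of the first 4-byte word and masks it off when that word is read back as a magnitude. A minimal sketch of that bit manipulation follows, assuming java.nio.ByteBuffer as a stand-in for ArrowBuf and following only the sparse variant's 0x7FFFFFFF mask. SignBitSketch and SIGN_MASK are hypothetical names, not part of the patch.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: java.nio.ByteBuffer stands in for ArrowBuf.
class SignBitSketch {
  static final int SIGN_MASK = 0x80000000;

  // Read the index-th 4-byte word; strip the sign bit from word 0 only.
  static int getInteger(int index, int start, ByteBuffer buf) {
    int value = buf.getInt(start + index * 4);
    return index == 0 ? (value & ~SIGN_MASK) : value;
  }

  static void setInteger(int index, int value, int start, ByteBuffer buf) {
    buf.putInt(start + index * 4, value);
  }

  // Set the most significant bit of word 0 when the value is negative.
  static void setSign(boolean negative, int start, ByteBuffer buf) {
    if (negative) {
      int word0 = buf.getInt(start);
      buf.putInt(start, word0 | SIGN_MASK);
    }
  }

  static boolean getSign(int start, ByteBuffer buf) {
    return (buf.getInt(start) & SIGN_MASK) != 0;
  }
}
```

As in the template, passing false to setSign leaves the word unchanged rather than clearing the bit, so a caller reusing a buffer would have to rewrite word 0 first.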

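The VariableLengthVectors template in the next diff stores values back to back in one data buffer and keeps a separate offset vector with valueCount + 1 entries, so element i occupies the bytes from offsets[i] up to offsets[i + 1]. A minimal sketch of that layout follows, assuming plain Java arrays in place of ArrowBuf and the UInt offset vector. OffsetLayoutSketch and its members are hypothetical names, and the append-only write is a simplification of the index-based setSafe methods.

```java
import java.util.Arrays;

// Hypothetical sketch of the offsets-plus-data layout; plain arrays, no ArrowBuf.
class OffsetLayoutSketch {
  private byte[] data = new byte[8];
  private int[] offsets = new int[4];   // offsets[0] == 0; entry i + 1 is the end of element i
  private int valueCount = 0;

  // Append-only write: grow if needed, copy the bytes, record the end offset.
  void append(byte[] bytes) {
    int start = offsets[valueCount];
    while (data.length < start + bytes.length) {
      data = Arrays.copyOf(data, data.length * 2);        // doubling, in the spirit of reAlloc()
    }
    if (valueCount + 1 >= offsets.length) {
      offsets = Arrays.copyOf(offsets, offsets.length * 2);
    }
    System.arraycopy(bytes, 0, data, start, bytes.length);
    offsets[valueCount + 1] = start + bytes.length;
    valueCount++;
  }

  // Length is inferred from adjacent offsets; no per-element length is stored.
  int getValueLength(int index) {
    return offsets[index + 1] - offsets[index];
  }

  byte[] get(int index) {
    return Arrays.copyOfRange(data, offsets[index], offsets[index + 1]);
  }

  int getValueCount() {
    return valueCount;
  }
}
```

Because lengths are derived from neighbouring offsets, an empty value costs only one extra offset entry, and the last offset doubles as the total byte length of the data, which is what getVarByteLength and getCurrentSizeInBytes read in the template.
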
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/VariableLengthVectors.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/VariableLengthVectors.java b/java/vector/src/main/codegen/templates/VariableLengthVectors.java
new file mode 100644
index 0000000..13d53b8
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/VariableLengthVectors.java
@@ -0,0 +1,644 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.lang.Override;
+
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.vector.BaseDataValueVector;
+import org.apache.drill.exec.vector.BaseValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+
+<#if type.major == "VarLen">
+<@pp.changeOutputFile name="/org/apache/arrow/vector/${minor.class}Vector.java" />
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/**
+ * ${minor.class}Vector implements a vector of variable width values.  Elements in the vector
+ * are accessed by position from the logical start of the vector.  A fixed width offsetVector
+ * is used to convert an element's position to its offset from the start of the (0-based)
+ * ArrowBuf.  Size is inferred by adjacent elements.
+ *   The width of each element is ${type.width} byte(s)
+ *   The equivalent Java primitive is '${minor.javaType!type.javaType}'
+ *
+ * NB: this class is automatically generated from ${.template_name} and ValueVectorTypes.tdd using FreeMarker.
+ */
+public final class ${minor.class}Vector extends BaseDataValueVector implements VariableWidthVector{
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${minor.class}Vector.class);
+
+  private static final int DEFAULT_RECORD_BYTE_COUNT = 8;
+  private static final int INITIAL_BYTE_COUNT = 4096 * DEFAULT_RECORD_BYTE_COUNT;
+  private static final int MIN_BYTE_COUNT = 4096;
+
+  public final static String OFFSETS_VECTOR_NAME = "$offsets$";
+  private final MaterializedField offsetsField = MaterializedField.create(OFFSETS_VECTOR_NAME, new MajorType(MinorType.UINT4, DataMode.REQUIRED));
+  private final UInt${type.width}Vector offsetVector = new UInt${type.width}Vector(offsetsField, allocator);
+  private final FieldReader reader = new ${minor.class}ReaderImpl(${minor.class}Vector.this);
+
+  private final Accessor accessor;
+  private final Mutator mutator;
+
+  private final UInt${type.width}Vector.Accessor oAccessor;
+
+  private int allocationSizeInBytes = INITIAL_BYTE_COUNT;
+  private int allocationMonitor = 0;
+
+  public ${minor.class}Vector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+    this.oAccessor = offsetVector.getAccessor();
+    this.accessor = new Accessor();
+    this.mutator = new Mutator();
+  }
+
+  @Override
+  public FieldReader getReader(){
+    return reader;
+  }
+
+  @Override
+  public int getBufferSize(){
+    if (getAccessor().getValueCount() == 0) {
+      return 0;
+    }
+    return offsetVector.getBufferSize() + data.writerIndex();
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    final int idx = offsetVector.getAccessor().get(valueCount);
+    return offsetVector.getBufferSizeFor(valueCount + 1) + idx;
+  }
+
+  @Override
+  public int getValueCapacity(){
+    return Math.max(offsetVector.getValueCapacity() - 1, 0);
+  }
+
+  @Override
+  public int getByteCapacity(){
+    return data.capacity();
+  }
+
+  @Override
+  public int getCurrentSizeInBytes() {
+    return offsetVector.getAccessor().get(getAccessor().getValueCount());
+  }
+
+  /**
+   * Returns the number of data bytes held by this variable-length vector.
+   * @return the total byte length of all values, or 0 if the vector is empty
+   */
+  public int getVarByteLength(){
+    final int valueCount = getAccessor().getValueCount();
+    if(valueCount == 0) {
+      return 0;
+    }
+    return offsetVector.getAccessor().get(valueCount);
+  }
+
+//  @Override
+//  public SerializedField getMetadata() {
+//    return getMetadataBuilder() //
+//             .addChild(offsetVector.getMetadata())
+//             .setValueCount(getAccessor().getValueCount()) //
+//             .setBufferLength(getBufferSize()) //
+//             .build();
+//  }
+//
+//  @Override
+//  public void load(SerializedField metadata, ArrowBuf buffer) {
+//     the bits vector is the first child (the order in which the children are added in getMetadataBuilder is significant)
+//    final SerializedField offsetField = metadata.getChild(0);
+//    offsetVector.load(offsetField, buffer);
+//
+//    final int capacity = buffer.capacity();
+//    final int offsetsLength = offsetField.getBufferLength();
+//    data = buffer.slice(offsetsLength, capacity - offsetsLength);
+//    data.retain();
+//  }
+
+  @Override
+  public void clear() {
+    super.clear();
+    offsetVector.clear();
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final ArrowBuf[] buffers = ObjectArrays.concat(offsetVector.getBuffers(false), super.getBuffers(false), ArrowBuf.class);
+    if (clear) {
+      // Retain each buffer before clear() so the returned buffers stay valid for the caller. TODO: refactor this interface.
+      for (final ArrowBuf buffer:buffers) {
+        buffer.retain(1);
+      }
+      clear();
+    }
+    return buffers;
+  }
+
+  public long getOffsetAddr(){
+    return offsetVector.getBuffer().memoryAddress();
+  }
+
+  public UInt${type.width}Vector getOffsetVector(){
+    return offsetVector;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator){
+    return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator){
+    return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new TransferImpl((${minor.class}Vector) to);
+  }
+
+  public void transferTo(${minor.class}Vector target){
+    target.clear();
+    this.offsetVector.transferTo(target.offsetVector);
+    target.data = data.transferOwnership(target.allocator).buffer;
+    target.data.writerIndex(data.writerIndex());
+    clear();
+  }
+
+  public void splitAndTransferTo(int startIndex, int length, ${minor.class}Vector target) {
+    UInt${type.width}Vector.Accessor offsetVectorAccessor = this.offsetVector.getAccessor();
+    final int startPoint = offsetVectorAccessor.get(startIndex);
+    final int sliceLength = offsetVectorAccessor.get(startIndex + length) - startPoint;
+    target.clear();
+    target.offsetVector.allocateNew(length + 1);
+    offsetVectorAccessor = this.offsetVector.getAccessor();
+    final UInt${type.width}Vector.Mutator targetOffsetVectorMutator = target.offsetVector.getMutator();
+    for (int i = 0; i < length + 1; i++) {
+      targetOffsetVectorMutator.set(i, offsetVectorAccessor.get(startIndex + i) - startPoint);
+    }
+    target.data = data.slice(startPoint, sliceLength).transferOwnership(target.allocator).buffer;
+    target.getMutator().setValueCount(length);
+  }
+
+  protected void copyFrom(int fromIndex, int thisIndex, ${minor.class}Vector from){
+    final UInt${type.width}Vector.Accessor fromOffsetVectorAccessor = from.offsetVector.getAccessor();
+    final int start = fromOffsetVectorAccessor.get(fromIndex);
+    final int end = fromOffsetVectorAccessor.get(fromIndex + 1);
+    final int len = end - start;
+
+    final int outputStart = offsetVector.data.get${(minor.javaType!type.javaType)?cap_first}(thisIndex * ${type.width});
+    from.data.getBytes(start, data, outputStart, len);
+    offsetVector.data.set${(minor.javaType!type.javaType)?cap_first}( (thisIndex+1) * ${type.width}, outputStart + len);
+  }
+
+  public boolean copyFromSafe(int fromIndex, int thisIndex, ${minor.class}Vector from){
+    final UInt${type.width}Vector.Accessor fromOffsetVectorAccessor = from.offsetVector.getAccessor();
+    final int start = fromOffsetVectorAccessor.get(fromIndex);
+    final int end =   fromOffsetVectorAccessor.get(fromIndex + 1);
+    final int len = end - start;
+    final int outputStart = offsetVector.data.get${(minor.javaType!type.javaType)?cap_first}(thisIndex * ${type.width});
+
+    while(data.capacity() < outputStart + len) {
+        reAlloc();
+    }
+
+    offsetVector.getMutator().setSafe(thisIndex + 1, outputStart + len);
+    from.data.getBytes(start, data, outputStart, len);
+    return true;
+  }
+
+  private class TransferImpl implements TransferPair{
+    ${minor.class}Vector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator){
+      to = new ${minor.class}Vector(field, allocator);
+    }
+
+    public TransferImpl(${minor.class}Vector to){
+      this.to = to;
+    }
+
+    @Override
+    public ${minor.class}Vector getTo(){
+      return to;
+    }
+
+    @Override
+    public void transfer(){
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      splitAndTransferTo(startIndex, length, to);
+    }
+
+    @Override
+    public void copyValueSafe(int fromIndex, int toIndex) {
+      to.copyFromSafe(fromIndex, toIndex, ${minor.class}Vector.this);
+    }
+  }
+
+  @Override
+  public void setInitialCapacity(final int valueCount) {
+    final long size = 1L * valueCount * ${type.width};
+    if (size > MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size");
+    }
+    allocationSizeInBytes = (int)size;
+    offsetVector.setInitialCapacity(valueCount + 1);
+  }
+
+  @Override
+  public void allocateNew() {
+    if(!allocateNewSafe()){
+      throw new OutOfMemoryException("Failure while allocating buffer.");
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    long curAllocationSize = allocationSizeInBytes;
+    if (allocationMonitor > 10) {
+      curAllocationSize = Math.max(MIN_BYTE_COUNT, curAllocationSize / 2);
+      allocationMonitor = 0;
+    } else if (allocationMonitor < -2) {
+      curAllocationSize = curAllocationSize * 2L;
+      allocationMonitor = 0;
+    }
+
+    if (curAllocationSize > MAX_ALLOCATION_SIZE) {
+      return false;
+    }
+
+    clear();
+    /* Allocate the data buffer and the offset vector as a unit. A variable
+     * width vector is a composite of both, so if either allocation fails we
+     * clear everything that was allocated and return false rather than leave
+     * the vector partially allocated.
+     */
+    try {
+      final int requestedSize = (int)curAllocationSize;
+      data = allocator.buffer(requestedSize);
+      allocationSizeInBytes = requestedSize;
+      offsetVector.allocateNew();
+    } catch (OutOfMemoryException e) {
+      clear();
+      return false;
+    }
+    data.readerIndex(0);
+    offsetVector.zeroVector();
+    return true;
+  }
+
+  @Override
+  public void allocateNew(int totalBytes, int valueCount) {
+    clear();
+    assert totalBytes >= 0;
+    try {
+      data = allocator.buffer(totalBytes);
+      offsetVector.allocateNew(valueCount + 1);
+    } catch (RuntimeException e) {
+      clear();
+      throw e;
+    }
+    data.readerIndex(0);
+    allocationSizeInBytes = totalBytes;
+    offsetVector.zeroVector();
+  }
+
+  @Override
+  public void reset() {
+    allocationSizeInBytes = INITIAL_BYTE_COUNT;
+    allocationMonitor = 0;
+    data.readerIndex(0);
+    offsetVector.zeroVector();
+    super.reset();
+  }
+
+  public void reAlloc() {
+    final long newAllocationSize = allocationSizeInBytes*2L;
+    if (newAllocationSize > MAX_ALLOCATION_SIZE)  {
+      throw new OversizedAllocationException("Unable to expand the buffer. Max allowed buffer size is reached.");
+    }
+
+    final ArrowBuf newBuf = allocator.buffer((int)newAllocationSize);
+    newBuf.setBytes(0, data, 0, data.capacity());
+    data.release();
+    data = newBuf;
+    allocationSizeInBytes = (int)newAllocationSize;
+  }
+
+  public void decrementAllocationMonitor() {
+    if (allocationMonitor > 0) {
+      allocationMonitor = 0;
+    }
+    --allocationMonitor;
+  }
+
+  private void incrementAllocationMonitor() {
+    ++allocationMonitor;
+  }
+
+  @Override
+  public Accessor getAccessor(){
+    return accessor;
+  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  public final class Accessor extends BaseValueVector.BaseAccessor implements VariableWidthAccessor {
+    final UInt${type.width}Vector.Accessor oAccessor = offsetVector.getAccessor();
+    public long getStartEnd(int index){
+      return oAccessor.getTwoAsLong(index);
+    }
+
+    public byte[] get(int index) {
+      assert index >= 0;
+      final int startIdx = oAccessor.get(index);
+      final int length = oAccessor.get(index + 1) - startIdx;
+      assert length >= 0;
+      final byte[] dst = new byte[length];
+      data.getBytes(startIdx, dst, 0, length);
+      return dst;
+    }
+
+    @Override
+    public int getValueLength(int index) {
+      final UInt${type.width}Vector.Accessor offsetVectorAccessor = offsetVector.getAccessor();
+      return offsetVectorAccessor.get(index + 1) - offsetVectorAccessor.get(index);
+    }
+
+    public void get(int index, ${minor.class}Holder holder){
+      holder.start = oAccessor.get(index);
+      holder.end = oAccessor.get(index + 1);
+      holder.buffer = data;
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      holder.isSet = 1;
+      holder.start = oAccessor.get(index);
+      holder.end = oAccessor.get(index + 1);
+      holder.buffer = data;
+    }
+
+
+    <#switch minor.class>
+    <#case "VarChar">
+    @Override
+    public ${friendlyType} getObject(int index) {
+      Text text = new Text();
+      text.set(get(index));
+      return text;
+    }
+    <#break>
+    <#case "Var16Char">
+    @Override
+    public ${friendlyType} getObject(int index) {
+      return new String(get(index), Charsets.UTF_16);
+    }
+    <#break>
+    <#default>
+    @Override
+    public ${friendlyType} getObject(int index) {
+      return get(index);
+    }
+    </#switch>
+
+    @Override
+    public int getValueCount() {
+      return Math.max(offsetVector.getAccessor().getValueCount()-1, 0);
+    }
+
+    @Override
+    public boolean isNull(int index){
+      return false;
+    }
+
+    public UInt${type.width}Vector getOffsetVector(){
+      return offsetVector;
+    }
+  }
+
+  /**
+   * Mutable${minor.class} implements a vector of variable width values.  Elements in the vector
+   * are accessed by position from the logical start of the vector.  A fixed width offsetVector
+   * is used to convert an element's position to its offset from the start of the (0-based)
+   * ArrowBuf.  Size is inferred by adjacent elements.
+   *   The width of each element is ${type.width} byte(s)
+   *   The equivalent Java primitive is '${minor.javaType!type.javaType}'
+   *
+   * NB: this class is automatically generated from ValueVectorTypes.tdd using FreeMarker.
+   */
+  public final class Mutator extends BaseValueVector.BaseMutator implements VariableWidthVector.VariableWidthMutator {
+
+    /**
+     * Set the variable length element at the specified index to the supplied byte array.
+     *
+     * @param index   position of the element to set
+     * @param bytes   array of bytes to write
+     */
+    protected void set(int index, byte[] bytes) {
+      assert index >= 0;
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      offsetVector.getMutator().set(index + 1, currentOffset + bytes.length);
+      data.setBytes(currentOffset, bytes, 0, bytes.length);
+    }
+
+    public void setSafe(int index, byte[] bytes) {
+      assert index >= 0;
+
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      while (data.capacity() < currentOffset + bytes.length) {
+        reAlloc();
+      }
+      offsetVector.getMutator().setSafe(index + 1, currentOffset + bytes.length);
+      data.setBytes(currentOffset, bytes, 0, bytes.length);
+    }
+
+    /**
+     * Set the variable length element at the specified index to the supplied byte array.
+     *
+     * @param index   position of the element to set
+     * @param bytes   array of bytes to write
+     * @param start   start index of bytes to write
+     * @param length  length of bytes to write
+     */
+    protected void set(int index, byte[] bytes, int start, int length) {
+      assert index >= 0;
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      offsetVector.getMutator().set(index + 1, currentOffset + length);
+      data.setBytes(currentOffset, bytes, start, length);
+    }
+
+    public void setSafe(int index, ByteBuffer bytes, int start, int length) {
+      assert index >= 0;
+
+      int currentOffset = offsetVector.getAccessor().get(index);
+
+      while (data.capacity() < currentOffset + length) {
+        reAlloc();
+      }
+      offsetVector.getMutator().setSafe(index + 1, currentOffset + length);
+      data.setBytes(currentOffset, bytes, start, length);
+    }
+
+    public void setSafe(int index, byte[] bytes, int start, int length) {
+      assert index >= 0;
+
+      final int currentOffset = offsetVector.getAccessor().get(index);
+
+      while (data.capacity() < currentOffset + length) {
+        reAlloc();
+      }
+      offsetVector.getMutator().setSafe(index + 1, currentOffset + length);
+      data.setBytes(currentOffset, bytes, start, length);
+    }
+
+    @Override
+    public void setValueLengthSafe(int index, int length) {
+      final int offset = offsetVector.getAccessor().get(index);
+      while(data.capacity() < offset + length ) {
+        reAlloc();
+      }
+      offsetVector.getMutator().setSafe(index + 1, offsetVector.getAccessor().get(index) + length);
+    }
+
+
+    public void setSafe(int index, int start, int end, ArrowBuf buffer){
+      final int len = end - start;
+      final int outputStart = offsetVector.data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+
+      while(data.capacity() < outputStart + len) {
+        reAlloc();
+      }
+
+      offsetVector.getMutator().setSafe( index+1,  outputStart + len);
+      buffer.getBytes(start, data, outputStart, len);
+    }
+
+    public void setSafe(int index, Nullable${minor.class}Holder holder){
+      assert holder.isSet == 1;
+
+      final int start = holder.start;
+      final int end =   holder.end;
+      final int len = end - start;
+
+      int outputStart = offsetVector.data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+
+      while(data.capacity() < outputStart + len) {
+        reAlloc();
+      }
+
+      holder.buffer.getBytes(start, data, outputStart, len);
+      offsetVector.getMutator().setSafe( index+1,  outputStart + len);
+    }
+
+    public void setSafe(int index, ${minor.class}Holder holder){
+      final int start = holder.start;
+      final int end =   holder.end;
+      final int len = end - start;
+      final int outputStart = offsetVector.data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+
+      while(data.capacity() < outputStart + len) {
+        reAlloc();
+      }
+
+      holder.buffer.getBytes(start, data, outputStart, len);
+      offsetVector.getMutator().setSafe( index+1,  outputStart + len);
+    }
+
+    protected void set(int index, int start, int length, ArrowBuf buffer){
+      assert index >= 0;
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      offsetVector.getMutator().set(index + 1, currentOffset + length);
+      final ArrowBuf bb = buffer.slice(start, length);
+      data.setBytes(currentOffset, bb);
+    }
+
+    protected void set(int index, Nullable${minor.class}Holder holder){
+      final int length = holder.end - holder.start;
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      offsetVector.getMutator().set(index + 1, currentOffset + length);
+      data.setBytes(currentOffset, holder.buffer, holder.start, length);
+    }
+
+    protected void set(int index, ${minor.class}Holder holder){
+      final int length = holder.end - holder.start;
+      final int currentOffset = offsetVector.getAccessor().get(index);
+      offsetVector.getMutator().set(index + 1, currentOffset + length);
+      data.setBytes(currentOffset, holder.buffer, holder.start, length);
+    }
+
+    @Override
+    public void setValueCount(int valueCount) {
+      final int currentByteCapacity = getByteCapacity();
+      final int idx = offsetVector.getAccessor().get(valueCount);
+      data.writerIndex(idx);
+      if (valueCount > 0 && currentByteCapacity > idx * 2) {
+        incrementAllocationMonitor();
+      } else if (allocationMonitor > 0) {
+        allocationMonitor = 0;
+      }
+      VectorTrimmer.trim(data, idx);
+      offsetVector.getMutator().setValueCount(valueCount == 0 ? 0 : valueCount+1);
+    }
+
+    @Override
+    public void generateTestData(int size){
+      boolean even = true;
+      <#switch minor.class>
+      <#case "Var16Char">
+      final java.nio.charset.Charset charset = Charsets.UTF_16;
+      <#break>
+      <#case "VarChar">
+      <#default>
+      final java.nio.charset.Charset charset = Charsets.UTF_8;
+      </#switch>
+      final byte[] evenValue = "aaaaa".getBytes(charset);
+      final byte[] oddValue = "bbbbbbbbbb".getBytes(charset);
+      for (int i = 0; i < size; i++, even = !even) {
+        set(i, even ? evenValue : oddValue);
+      }
+      setValueCount(size);
+    }
+  }
+}
+
+</#if> <#-- type.major -->
+</#list>
+</#list>
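
The setSafe overloads in the template above share one pattern: read the end offset of the previous value, grow the data buffer until the new value fits, copy the bytes, then record the new end offset at index + 1. Below is a minimal sketch of that pattern using plain Java arrays. VarWidthSketch, its initial sizes, and its method names are hypothetical stand-ins chosen for illustration; they are not part of the Arrow API, and the real code works on ArrowBuf rather than byte[].

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a variable-width vector; not the Arrow API. */
class VarWidthSketch {
  // offsets[i] is where value i starts; offsets[i + 1] is where it ends.
  private int[] offsets = new int[5];
  // Deliberately small so that the growth loop is exercised.
  private byte[] data = new byte[8];

  /** Writes value {@code index}. Values are assumed to be written in increasing index order. */
  void setSafe(int index, byte[] bytes) {
    if (index + 1 >= offsets.length) {
      offsets = Arrays.copyOf(offsets, Math.max(offsets.length * 2, index + 2));
    }
    final int start = offsets[index];
    // Double the data buffer until the value fits, mirroring the reAlloc() loop.
    while (data.length < start + bytes.length) {
      data = Arrays.copyOf(data, data.length * 2);
    }
    System.arraycopy(bytes, 0, data, start, bytes.length);
    offsets[index + 1] = start + bytes.length;
  }

  /** Returns a copy of the bytes stored for value {@code index}. */
  byte[] get(int index) {
    return Arrays.copyOfRange(data, offsets[index], offsets[index + 1]);
  }

  int dataCapacity() {
    return data.length;
  }

  public static void main(String[] args) {
    VarWidthSketch v = new VarWidthSketch();
    v.setSafe(0, "aaaaa".getBytes(StandardCharsets.UTF_8));
    v.setSafe(1, "bbbbbbbbbb".getBytes(StandardCharsets.UTF_8));
    System.out.println(new String(v.get(1), StandardCharsets.UTF_8));
  }
}
```

Storing one more offset than there are values is what lets the length of value i be computed as offsets[i + 1] - offsets[i] without a separate length array.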

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/AddOrGetResult.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/AddOrGetResult.java b/java/vector/src/main/java/org/apache/arrow/vector/AddOrGetResult.java
new file mode 100644
index 0000000..388eb9c
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/AddOrGetResult.java
@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import com.google.common.base.Preconditions;
+
+public class AddOrGetResult<V extends ValueVector> {
+  private final V vector;
+  private final boolean created;
+
+  public AddOrGetResult(V vector, boolean created) {
+    this.vector = Preconditions.checkNotNull(vector);
+    this.created = created;
+  }
+
+  public V getVector() {
+    return vector;
+  }
+
+  public boolean isCreated() {
+    return created;
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/AllocationHelper.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/AllocationHelper.java b/java/vector/src/main/java/org/apache/arrow/vector/AllocationHelper.java
new file mode 100644
index 0000000..54c3cd7
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/AllocationHelper.java
@@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import org.apache.arrow.vector.complex.RepeatedFixedWidthVectorLike;
+import org.apache.arrow.vector.complex.RepeatedVariableWidthVectorLike;
+
+public class AllocationHelper {
+//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AllocationHelper.class);
+
+  public static void allocate(ValueVector v, int valueCount, int bytesPerValue) {
+    allocate(v, valueCount, bytesPerValue, 5);
+  }
+
+  public static void allocatePrecomputedChildCount(ValueVector v, int valueCount, int bytesPerValue, int childValCount){
+    if(v instanceof FixedWidthVector) {
+      ((FixedWidthVector) v).allocateNew(valueCount);
+    } else if (v instanceof VariableWidthVector) {
+      ((VariableWidthVector) v).allocateNew(valueCount * bytesPerValue, valueCount);
+    } else if(v instanceof RepeatedFixedWidthVectorLike) {
+      ((RepeatedFixedWidthVectorLike) v).allocateNew(valueCount, childValCount);
+    } else if(v instanceof RepeatedVariableWidthVectorLike) {
+      ((RepeatedVariableWidthVectorLike) v).allocateNew(childValCount * bytesPerValue, valueCount, childValCount);
+    } else {
+      v.allocateNew();
+    }
+  }
+
+  public static void allocate(ValueVector v, int valueCount, int bytesPerValue, int repeatedPerTop){
+    allocatePrecomputedChildCount(v, valueCount, bytesPerValue, repeatedPerTop * valueCount);
+  }
+
+  /**
+   * Allocates the exact amount if v is fixed width; otherwise falls back to dynamic allocation.
+   * @param v value vector we are trying to allocate
+   * @param valueCount number of values to allocate space for
+   * @throws org.apache.arrow.memory.OutOfMemoryException if it can't allocate the memory
+   */
+  public static void allocateNew(ValueVector v, int valueCount) {
+    if (v instanceof  FixedWidthVector) {
+      ((FixedWidthVector) v).allocateNew(valueCount);
+    } else {
+      v.allocateNew();
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java b/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java
new file mode 100644
index 0000000..b129ea9
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.MaterializedField;
+
+
+public abstract class BaseDataValueVector extends BaseValueVector {
+
+  protected final static byte[] emptyByteArray = new byte[]{}; // Nullable vectors use this
+
+  protected ArrowBuf data;
+
+  public BaseDataValueVector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+    data = allocator.getEmpty();
+  }
+
+  @Override
+  public void clear() {
+    if (data != null) {
+      data.release();
+    }
+    data = allocator.getEmpty();
+    super.clear();
+  }
+
+  @Override
+  public void close() {
+    clear();
+    if (data != null) {
+      data.release();
+      data = null;
+    }
+    super.close();
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    ArrowBuf[] out;
+    if (getBufferSize() == 0) {
+      out = new ArrowBuf[0];
+    } else {
+      out = new ArrowBuf[]{data};
+      data.readerIndex(0);
+      if (clear) {
+        data.retain(1);
+      }
+    }
+    if (clear) {
+      clear();
+    }
+    return out;
+  }
+
+  @Override
+  public int getBufferSize() {
+    if (getAccessor().getValueCount() == 0) {
+      return 0;
+    }
+    return data.writerIndex();
+  }
+
+  public ArrowBuf getBuffer() {
+    return data;
+  }
+
+  /**
+   * This method has an effect similar to allocateNew(), without actually clearing and reallocating
+   * the value vector. The purpose is to move the value vector to a "mutate" state.
+   */
+  public void reset() {}
+}
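
In getBuffers(boolean clear) above, the buffer is retained once before clear() runs, so the release inside clear() does not drop the count to zero while the caller still holds the buffer. The sketch below shows that handoff with a toy reference counter. RefCountedBuf and OwnerSketch are hypothetical names invented for this illustration, and replacing the buffer with a fresh object in clear() is a simplification of allocator.getEmpty(); none of this is the ArrowBuf API.

```java
/** Hypothetical reference-counted buffer; a stand-in for ArrowBuf, not its API. */
class RefCountedBuf {
  private int refCnt = 1;

  void retain() {
    refCnt++;
  }

  void release() {
    if (refCnt <= 0) {
      throw new IllegalStateException("zero refcount");
    }
    refCnt--;
  }

  int refCnt() {
    return refCnt;
  }
}

/** Hypothetical owner that hands its buffer to a caller, loosely following getBuffers(clear). */
class OwnerSketch {
  private RefCountedBuf data = new RefCountedBuf();

  /** Drops this owner's reference and starts over with a fresh buffer. */
  void clear() {
    data.release();
    data = new RefCountedBuf();
  }

  /** Returns the current buffer; with clear == true the caller takes over ownership. */
  RefCountedBuf getBuffer(boolean clear) {
    final RefCountedBuf out = data;
    if (clear) {
      out.retain(); // extra reference on behalf of the caller
      clear();      // releases the owner's reference; the caller's reference remains
    }
    return out;
  }

  public static void main(String[] args) {
    OwnerSketch owner = new OwnerSketch();
    RefCountedBuf handedOff = owner.getBuffer(true);
    System.out.println("refcount after handoff: " + handedOff.refCnt());
  }
}
```

Without the retain() call, the release in clear() would take the count to zero and the caller would receive a buffer that is already dead, which is the condition checkBufRefs guards against.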

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java b/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java
new file mode 100644
index 0000000..8bca3c0
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java
@@ -0,0 +1,125 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Iterator;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Iterators;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.util.TransferPair;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseValueVector implements ValueVector {
+  private static final Logger logger = LoggerFactory.getLogger(BaseValueVector.class);
+
+  public static final int MAX_ALLOCATION_SIZE = Integer.MAX_VALUE;
+  public static final int INITIAL_VALUE_ALLOCATION = 4096;
+
+  protected final BufferAllocator allocator;
+  protected final MaterializedField field;
+
+  protected BaseValueVector(MaterializedField field, BufferAllocator allocator) {
+    this.field = Preconditions.checkNotNull(field, "field cannot be null");
+    this.allocator = Preconditions.checkNotNull(allocator, "allocator cannot be null");
+  }
+
+  @Override
+  public String toString() {
+    return super.toString() + "[field = " + field + ", ...]";
+  }
+
+  @Override
+  public void clear() {
+    getMutator().reset();
+  }
+
+  @Override
+  public void close() {
+    clear();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return field;
+  }
+
+  public MaterializedField getField(String ref){
+    return getField().withPath(ref);
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return getTransferPair(getField().getPath(), allocator);
+  }
+
+//  public static SerializedField getMetadata(BaseValueVector vector) {
+//    return getMetadataBuilder(vector).build();
+//  }
+//
+//  protected static SerializedField.Builder getMetadataBuilder(BaseValueVector vector) {
+//    return SerializedFieldHelper.getAsBuilder(vector.getField())
+//        .setValueCount(vector.getAccessor().getValueCount())
+//        .setBufferLength(vector.getBufferSize());
+//  }
+
+  public abstract static class BaseAccessor implements ValueVector.Accessor {
+    protected BaseAccessor() { }
+
+    @Override
+    public boolean isNull(int index) {
+      return false;
+    }
+  }
+
+  public abstract static class BaseMutator implements ValueVector.Mutator {
+    protected BaseMutator() { }
+
+    @Override
+    public void generateTestData(int values) {}
+
+    //TODO: consider making mutator stateless(if possible) on another issue.
+    public void reset() {}
+  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    return Iterators.emptyIterator();
+  }
+
+  public static boolean checkBufRefs(final ValueVector vv) {
+    for(final ArrowBuf buffer : vv.getBuffers(false)) {
+      if (buffer.refCnt() <= 0) {
+        throw new IllegalStateException("zero refcount");
+      }
+    }
+
+    return true;
+  }
+
+  @Override
+  public BufferAllocator getAllocator() {
+    return allocator;
+  }
+}
+

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java b/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
new file mode 100644
index 0000000..952e902
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
@@ -0,0 +1,450 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.impl.BitReaderImpl;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.BitHolder;
+import org.apache.arrow.vector.holders.NullableBitHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.util.OversizedAllocationException;
+import org.apache.arrow.vector.util.TransferPair;
+
+/**
+ * Bit implements a vector of bit-width values. Elements in the vector are accessed by position from the logical start
+ * of the vector. The width of each element is 1 bit. The equivalent Java primitive is an int containing the value '0'
+ * or '1'.
+ */
+public final class BitVector extends BaseDataValueVector implements FixedWidthVector {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BitVector.class);
+
+  private final FieldReader reader = new BitReaderImpl(BitVector.this);
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+
+  private int valueCount;
+  private int allocationSizeInBytes = INITIAL_VALUE_ALLOCATION;
+  private int allocationMonitor = 0;
+
+  public BitVector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+  }
+
+  @Override
+  public FieldReader getReader() {
+    return reader;
+  }
+
+  @Override
+  public int getBufferSize() {
+    return getSizeFromCount(valueCount);
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    return getSizeFromCount(valueCount);
+  }
+
+  private int getSizeFromCount(int valueCount) {
+    return (int) Math.ceil(valueCount / 8.0);
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return (int)Math.min((long)Integer.MAX_VALUE, data.capacity() * 8L);
+  }
+
+  private int getByteIndex(int index) {
+    return (int) Math.floor(index / 8.0);
+  }
+
+  @Override
+  public void setInitialCapacity(final int valueCount) {
+    allocationSizeInBytes = getSizeFromCount(valueCount);
+  }
+
+  @Override
+  public void allocateNew() {
+    if (!allocateNewSafe()) {
+      throw new OutOfMemoryException();
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    long curAllocationSize = allocationSizeInBytes;
+    if (allocationMonitor > 10) {
+      curAllocationSize = Math.max(8, allocationSizeInBytes / 2);
+      allocationMonitor = 0;
+    } else if (allocationMonitor < -2) {
+      curAllocationSize = allocationSizeInBytes * 2L;
+      allocationMonitor = 0;
+    }
+
+    try {
+      allocateBytes(curAllocationSize);
+    } catch (OutOfMemoryException ex) {
+      return false;
+    }
+    return true;
+  }
+
+  @Override
+  public void reset() {
+    valueCount = 0;
+    allocationSizeInBytes = INITIAL_VALUE_ALLOCATION;
+    allocationMonitor = 0;
+    zeroVector();
+    super.reset();
+  }
+
+  /**
+   * Allocate a new memory space for this vector. Must be called prior to using the ValueVector.
+   *
+   * @param valueCount
+   *          The number of values which can be contained within this vector.
+   */
+  @Override
+  public void allocateNew(int valueCount) {
+    final int size = getSizeFromCount(valueCount);
+    allocateBytes(size);
+  }
+
+  private void allocateBytes(final long size) {
+    if (size > MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size");
+    }
+
+    final int curSize = (int) size;
+    clear();
+    data = allocator.buffer(curSize);
+    zeroVector();
+    allocationSizeInBytes = curSize;
+  }
+
+  /**
+   * Allocates a new buffer with double the capacity, copies the data into it, installs it as the vector's buffer, and releases the old one.
+   */
+  public void reAlloc() {
+    final long newAllocationSize = allocationSizeInBytes * 2L;
+    if (newAllocationSize > MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size");
+    }
+
+    final int curSize = (int)newAllocationSize;
+    final ArrowBuf newBuf = allocator.buffer(curSize);
+    newBuf.setZero(0, newBuf.capacity());
+    newBuf.setBytes(0, data, 0, data.capacity());
+    data.release();
+    data = newBuf;
+    allocationSizeInBytes = curSize;
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void zeroVector() {
+    data.setZero(0, data.capacity());
+  }
+
+  public void copyFrom(int inIndex, int outIndex, BitVector from) {
+    this.mutator.set(outIndex, from.accessor.get(inIndex));
+  }
+
+  public boolean copyFromSafe(int inIndex, int outIndex, BitVector from) {
+    if (outIndex >= this.getValueCapacity()) {
+      decrementAllocationMonitor();
+      return false;
+    }
+    copyFrom(inIndex, outIndex, from);
+    return true;
+  }
+
+//  @Override
+//  public void load(SerializedField metadata, DrillBuf buffer) {
+//    Preconditions.checkArgument(this.field.getPath().equals(metadata.getNamePart().getName()), "The field %s doesn't match the provided metadata %s.", this.field, metadata);
+//    final int valueCount = metadata.getValueCount();
+//    final int expectedLength = getSizeFromCount(valueCount);
+//    final int actualLength = metadata.getBufferLength();
+//    assert expectedLength == actualLength: "expected and actual buffer sizes do not match";
+//
+//    clear();
+//    data = buffer.slice(0, actualLength);
+//    data.retain();
+//    this.valueCount = valueCount;
+//  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new TransferImpl((BitVector) to);
+  }
+
+
+  public void transferTo(BitVector target) {
+    target.clear();
+    if (target.data != null) {
+      target.data.release();
+    }
+    target.data = data;
+    target.data.retain(1);
+    target.valueCount = valueCount;
+    clear();
+  }
+
+  public void splitAndTransferTo(int startIndex, int length, BitVector target) {
+    assert startIndex + length <= valueCount;
+    int firstByte = getByteIndex(startIndex);
+    int byteSize = getSizeFromCount(length);
+    int offset = startIndex % 8;
+    if (offset == 0) {
+      target.clear();
+      // slice
+      if (target.data != null) {
+        target.data.release();
+      }
+      target.data = (ArrowBuf) data.slice(firstByte, byteSize);
+      target.data.retain(1);
+    } else {
+      // Copy data
+      // When the first bit starts in the middle of a byte (offset != 0), copy data from the source BitVector.
+      // Each byte in the target is composed of part of the i-th byte and part of the (i+1)-th byte.
+      // The last byte copied to the target needs special handling:
+      //   1) if length ends on a partial byte (length % 8 != 0), copy only the remaining bits.
+      //   2) otherwise, copy the last byte in the same way as the prior bytes.
+      target.clear();
+      target.allocateNew(length);
+      // TODO maybe do this one word at a time, rather than byte?
+      for(int i = 0; i < byteSize - 1; i++) {
+        target.data.setByte(i, (((this.data.getByte(firstByte + i) & 0xFF) >>> offset) + (this.data.getByte(firstByte + i + 1) <<  (8 - offset))));
+      }
+      if (length % 8 != 0) {
+        target.data.setByte(byteSize - 1, ((this.data.getByte(firstByte + byteSize - 1) & 0xFF) >>> offset));
+      } else {
+        target.data.setByte(byteSize - 1,
+            (((this.data.getByte(firstByte + byteSize - 1) & 0xFF) >>> offset) + (this.data.getByte(firstByte + byteSize) <<  (8 - offset))));
+      }
+    }
+    target.getMutator().setValueCount(length);
+  }
+
+  private class TransferImpl implements TransferPair {
+    BitVector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator) {
+      this.to = new BitVector(field, allocator);
+    }
+
+    public TransferImpl(BitVector to) {
+      this.to = to;
+    }
+
+    @Override
+    public BitVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void transfer() {
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      splitAndTransferTo(startIndex, length, to);
+    }
+
+    @Override
+    public void copyValueSafe(int fromIndex, int toIndex) {
+      to.copyFromSafe(fromIndex, toIndex, BitVector.this);
+    }
+  }
+
+  private void decrementAllocationMonitor() {
+    if (allocationMonitor > 0) {
+      allocationMonitor = 0;
+    }
+    --allocationMonitor;
+  }
+
+  private void incrementAllocationMonitor() {
+    ++allocationMonitor;
+  }
+
+  public class Accessor extends BaseAccessor {
+
+    /**
+     * Get the byte holding the desired bit, then mask all other bits. Iff the result is 0, the bit was not set.
+     *
+     * @param index
+     *          position of the bit in the vector
+     * @return 1 if set, otherwise 0
+     */
+    public final int get(int index) {
+      int byteIndex = index >> 3;
+      byte b = data.getByte(byteIndex);
+      int bitIndex = index & 7;
+      return Long.bitCount(b &  (1L << bitIndex));
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return false;
+    }
+
+    @Override
+    public final Boolean getObject(int index) {
+      return Boolean.valueOf(get(index) != 0);
+    }
+
+    @Override
+    public final int getValueCount() {
+      return valueCount;
+    }
+
+    public final void get(int index, BitHolder holder) {
+      holder.value = get(index);
+    }
+
+    public final void get(int index, NullableBitHolder holder) {
+      holder.isSet = 1;
+      holder.value = get(index);
+    }
+  }
+
+  /**
+   * MutableBit implements a vector of bit-width values. Elements in the vector are accessed by position from the
+   * logical start of the vector. Values should be pushed onto the vector sequentially, but may be randomly accessed.
+   *
+   * NB: this class is automatically generated from ValueVectorTypes.tdd using FreeMarker.
+   */
+  public class Mutator extends BaseMutator {
+
+    private Mutator() {
+    }
+
+    /**
+     * Set the bit at the given index to the specified value.
+     *
+     * @param index
+     *          position of the bit to set
+     * @param value
+     *          value to set (either 1 or 0)
+     */
+    public final void set(int index, int value) {
+      int byteIndex = index >> 3;
+      int bitIndex = index & 7;
+      byte currentByte = data.getByte(byteIndex);
+      byte bitMask = (byte) (1L << bitIndex);
+      if (value != 0) {
+        currentByte |= bitMask;
+      } else {
+        currentByte -= (bitMask & currentByte);
+      }
+
+      data.setByte(byteIndex, currentByte);
+    }
+
+    public final void set(int index, BitHolder holder) {
+      set(index, holder.value);
+    }
+
+    final void set(int index, NullableBitHolder holder) {
+      set(index, holder.value);
+    }
+
+    public void setSafe(int index, int value) {
+      while(index >= getValueCapacity()) {
+        reAlloc();
+      }
+      set(index, value);
+    }
+
+    public void setSafe(int index, BitHolder holder) {
+      while(index >= getValueCapacity()) {
+        reAlloc();
+      }
+      set(index, holder.value);
+    }
+
+    public void setSafe(int index, NullableBitHolder holder) {
+      while(index >= getValueCapacity()) {
+        reAlloc();
+      }
+      set(index, holder.value);
+    }
+
+    @Override
+    public final void setValueCount(int valueCount) {
+      int currentValueCapacity = getValueCapacity();
+      BitVector.this.valueCount = valueCount;
+      int idx = getSizeFromCount(valueCount);
+      while(valueCount > getValueCapacity()) {
+        reAlloc();
+      }
+      if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
+        incrementAllocationMonitor();
+      } else if (allocationMonitor > 0) {
+        allocationMonitor = 0;
+      }
+      VectorTrimmer.trim(data, idx);
+    }
+
+    @Override
+    public final void generateTestData(int values) {
+      boolean even = true;
+      for(int i = 0; i < values; i++, even = !even) {
+        if (even) {
+          set(i, 1);
+        }
+      }
+      setValueCount(values);
+    }
+
+  }
+
+  @Override
+  public void clear() {
+    this.valueCount = 0;
+    super.clear();
+  }
+}
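
BitVector addresses bit i as bit (i & 7) of byte (i >> 3), and splitAndTransferTo has to stitch each target byte together from two neighbouring source bytes when the start index is not byte aligned. The following is a minimal sketch of both ideas on a plain byte[]. BitmapSketch and its methods are hypothetical names for illustration only. The slice method is a simplified variant, not a copy of the loop in the patch: it always pulls the high bits from the following byte when one exists and then masks the bits beyond length.

```java
/** Hypothetical bitmap helpers on a plain byte[]; a stand-in for BitVector, not its API. */
class BitmapSketch {

  /** Returns 1 if bit {@code index} is set, otherwise 0. */
  static int get(byte[] data, int index) {
    return (data[index >> 3] >> (index & 7)) & 1;
  }

  /** Sets bit {@code index} when value is non-zero; clears it otherwise. */
  static void set(byte[] data, int index, int value) {
    final int mask = 1 << (index & 7);
    if (value != 0) {
      data[index >> 3] |= mask;
    } else {
      data[index >> 3] &= ~mask;
    }
  }

  /** Copies {@code length} bits starting at bit {@code startIndex} into a new byte-aligned array. */
  static byte[] slice(byte[] data, int startIndex, int length) {
    final int firstByte = startIndex >> 3;
    final int offset = startIndex & 7;
    final int byteSize = (length + 7) >> 3;
    final byte[] out = new byte[byteSize];
    for (int i = 0; i < byteSize; i++) {
      // Low part: the upper (8 - offset) bits of source byte i.
      final int low = (data[firstByte + i] & 0xFF) >>> offset;
      // High part: the lower `offset` bits of source byte i + 1, if that byte exists.
      int high = 0;
      if (offset != 0 && firstByte + i + 1 < data.length) {
        high = (data[firstByte + i + 1] & 0xFF) << (8 - offset);
      }
      out[i] = (byte) (low | high);
    }
    // Clear any bits past `length` in the final byte.
    final int tail = length & 7;
    if (tail != 0) {
      out[byteSize - 1] &= (byte) ((1 << tail) - 1);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] data = new byte[3];
    set(data, 5, 1);
    set(data, 16, 1);
    byte[] sliced = slice(data, 5, 12);
    System.out.println(get(sliced, 0) + " " + get(sliced, 11));
  }
}
```

Shifting and masking replace the Math.floor and Math.ceil calls on doubles used elsewhere in the class; for non-negative indexes the integer forms give the same results.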

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/FixedWidthVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/FixedWidthVector.java b/java/vector/src/main/java/org/apache/arrow/vector/FixedWidthVector.java
new file mode 100644
index 0000000..5905700
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/FixedWidthVector.java
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+
+public interface FixedWidthVector extends ValueVector{
+
+  /**
+   * Allocate a new memory space for this vector.  Must be called prior to using the ValueVector.
+   *
+   * @param valueCount   Number of values in the vector.
+   */
+  void allocateNew(int valueCount);
+
+  /**
+   * Zero out the underlying buffer backing this vector.
+   */
+  void zeroVector();
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/NullableVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/NullableVector.java b/java/vector/src/main/java/org/apache/arrow/vector/NullableVector.java
new file mode 100644
index 0000000..00c33fc
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/NullableVector.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+public interface NullableVector extends ValueVector{
+
+  ValueVector getValuesVector();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/NullableVectorDefinitionSetter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/NullableVectorDefinitionSetter.java b/java/vector/src/main/java/org/apache/arrow/vector/NullableVectorDefinitionSetter.java
new file mode 100644
index 0000000..b819c5d
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/NullableVectorDefinitionSetter.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+public interface NullableVectorDefinitionSetter {
+
+  public void setIndexDefined(int index);
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/ObjectVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/ObjectVector.java b/java/vector/src/main/java/org/apache/arrow/vector/ObjectVector.java
new file mode 100644
index 0000000..b806b18
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/ObjectVector.java
@@ -0,0 +1,220 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.ObjectHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.util.TransferPair;
+
+public class ObjectVector extends BaseValueVector {
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int maxCount = 0;
+  private int count = 0;
+  private int allocationSize = 4096;
+
+  private List<Object[]> objectArrayList = new ArrayList<>();
+
+  public ObjectVector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+  }
+
+  public void addNewArray() {
+    objectArrayList.add(new Object[allocationSize]);
+    maxCount += allocationSize;
+  }
+
+  @Override
+  public FieldReader getReader() {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  public final class Mutator implements ValueVector.Mutator {
+
+    public void set(int index, Object obj) {
+      int listOffset = index / allocationSize;
+      if (listOffset >= objectArrayList.size()) {
+        addNewArray();
+      }
+      objectArrayList.get(listOffset)[index % allocationSize] = obj;
+    }
+
+    public boolean setSafe(int index, long value) {
+      set(index, value);
+      return true;
+    }
+
+    protected void set(int index, ObjectHolder holder) {
+      set(index, holder.obj);
+    }
+
+    public boolean setSafe(int index, ObjectHolder holder){
+      set(index, holder);
+      return true;
+    }
+
+    @Override
+    public void setValueCount(int valueCount) {
+      count = valueCount;
+    }
+
+    @Override
+    public void reset() {
+      count = 0;
+      maxCount = 0;
+      objectArrayList = new ArrayList<>();
+      addNewArray();
+    }
+
+    @Override
+    public void generateTestData(int values) {
+    }
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    // NoOp
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
+    addNewArray();
+  }
+
+  public void allocateNew(int valueCount) throws OutOfMemoryException {
+    while (maxCount < valueCount) {
+      addNewArray();
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    allocateNew();
+    return true;
+  }
+
+  @Override
+  public int getBufferSize() {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  @Override
+  public void close() {
+    clear();
+  }
+
+  @Override
+  public void clear() {
+    objectArrayList.clear();
+    maxCount = 0;
+    count = 0;
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return field;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return maxCount;
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+//  @Override
+//  public void load(UserBitShared.SerializedField metadata, DrillBuf buffer) {
+//    throw new UnsupportedOperationException("ObjectVector does not support this");
+//  }
+//
+//  @Override
+//  public UserBitShared.SerializedField getMetadata() {
+//    throw new UnsupportedOperationException("ObjectVector does not support this");
+//  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    throw new UnsupportedOperationException("ObjectVector does not support this");
+  }
+
+  public final class Accessor extends BaseAccessor {
+    @Override
+    public Object getObject(int index) {
+      int listOffset = index / allocationSize;
+      if (listOffset >= objectArrayList.size()) {
+        addNewArray();
+      }
+      return objectArrayList.get(listOffset)[index % allocationSize];
+    }
+
+    @Override
+    public int getValueCount() {
+      return count;
+    }
+
+    public Object get(int index) {
+      return getObject(index);
+    }
+
+    public void get(int index, ObjectHolder holder){
+      holder.obj = getObject(index);
+    }
+  }
+}
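
The Mutator and Accessor above locate a value by splitting its index into a chunk number (index / allocationSize) and a slot within that chunk (index % allocationSize). The following is a minimal standalone sketch of that addressing scheme, not code from the patch: the class name ChunkedObjectStore and its methods are hypothetical. It also deliberately differs from ObjectVector in two ways, both assumptions about what a caller might want: it grows in a loop so that an index several chunks past the end is still reachable, and reads never allocate.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of chunked object storage; not part of the Arrow patch. */
public class ChunkedObjectStore {
  private final int chunkSize;
  private final List<Object[]> chunks = new ArrayList<>();

  public ChunkedObjectStore(int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.chunkSize = chunkSize;
  }

  /** Stores a value, appending as many chunks as needed to reach the index. */
  public void set(int index, Object value) {
    int chunk = index / chunkSize;
    while (chunk >= chunks.size()) {
      chunks.add(new Object[chunkSize]);
    }
    chunks.get(chunk)[index % chunkSize] = value;
  }

  /** Returns the stored value, or null if the index was never allocated. */
  public Object get(int index) {
    int chunk = index / chunkSize;
    return chunk < chunks.size() ? chunks.get(chunk)[index % chunkSize] : null;
  }

  /** Total number of slots currently allocated. */
  public int capacity() {
    return chunks.size() * chunkSize;
  }

  public static void main(String[] args) {
    ChunkedObjectStore store = new ChunkedObjectStore(4);
    store.set(9, "x");                    // index 9 lives in chunk 2, slot 1
    System.out.println(store.get(9));     // prints "x"
    System.out.println(store.capacity()); // prints 12 (three chunks of four)
  }
}
```

Growing chunk by chunk means existing values are never copied, at the cost of one extra indirection per access.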

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/SchemaChangeCallBack.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/SchemaChangeCallBack.java b/java/vector/src/main/java/org/apache/arrow/vector/SchemaChangeCallBack.java
new file mode 100644
index 0000000..fc0a066
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/SchemaChangeCallBack.java
@@ -0,0 +1,52 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import org.apache.arrow.vector.util.CallBack;
+
+
+public class SchemaChangeCallBack implements CallBack {
+  private boolean schemaChanged = false;
+
+  /**
+   * Constructs a schema-change callback with the schema-changed state set to
+   * {@code false}.
+   */
+  public SchemaChangeCallBack() {
+  }
+
+  /**
+   * Sets the schema-changed state to {@code true}.
+   */
+  @Override
+  public void doWork() {
+    schemaChanged = true;
+  }
+
+  /**
+   * Returns the value of schema-changed state, <strong>resetting</strong> the
+   * schema-changed state to {@code false}.
+   */
+  public boolean getSchemaChangedAndReset() {
+    final boolean current = schemaChanged;
+    schemaChanged = false;
+    return current;
+  }
+}
+
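
The read-and-reset behavior of getSchemaChangedAndReset() is easy to miss: the first read after doWork() returns true, and every later read returns false until doWork() runs again. The sketch below shows that sequence. To stay self-contained it inlines a small copy of the flag logic under the hypothetical names SchemaChangeDemo and Flag rather than importing the Arrow classes.

```java
/** Hypothetical demo of a read-and-reset flag; names are illustrative only. */
public class SchemaChangeDemo {

  /** Inlined copy of the callback's state handling so the sketch compiles alone. */
  static final class Flag {
    private boolean changed = false;

    /** Marks that a schema change happened. */
    void doWork() {
      changed = true;
    }

    /** Returns the current state and clears it in the same call. */
    boolean getChangedAndReset() {
      final boolean current = changed;
      changed = false;
      return current;
    }
  }

  public static void main(String[] args) {
    Flag flag = new Flag();
    System.out.println(flag.getChangedAndReset()); // false: nothing has changed yet
    flag.doWork();
    System.out.println(flag.getChangedAndReset()); // true: the change is reported once
    System.out.println(flag.getChangedAndReset()); // false: the read above cleared it
  }
}
```

A caller that needs the value in two places should therefore read it once and keep the result in a local variable.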


[17/17] arrow git commit: ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a cmake-based build system. The codebase generally follows Google C++ style guide, but more cleaning to be more conforming is

Posted by ja...@apache.org.
ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a CMake-based build system. The codebase generally follows the Google C++ style guide, but more cleanup is needed to conform fully. It uses googletest for unit testing.

Feature-wise, this patch includes:

* A small logical data type object model
* Immutable array accessor containers for fixed-width primitive and list types
* A String array container implemented as a List<byte>
* Builder classes for the primitive arrays and list types
* A simple memory management model using mutable and immutable buffers and
  C++ RAII idioms
* Modest unit test coverage for the above features.


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/23c4b08d
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/23c4b08d
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/23c4b08d

Branch: refs/heads/master
Commit: 23c4b08d154f8079806a1f0258d7e4af17bdf5fd
Parents: 16e44e3
Author: Wes McKinney <we...@cloudera.com>
Authored: Tue Feb 16 17:56:05 2016 -0800
Committer: Jacques Nadeau <ja...@apache.org>
Committed: Wed Feb 17 04:39:03 2016 -0800

----------------------------------------------------------------------
 cpp/.gitignore                            |   21 +
 cpp/CMakeLists.txt                        |  483 ++
 cpp/LICENSE.txt                           |  202 +
 cpp/README.md                             |   48 +
 cpp/build-support/asan_symbolize.py       |  360 ++
 cpp/build-support/bootstrap_toolchain.py  |  114 +
 cpp/build-support/cpplint.py              | 6323 ++++++++++++++++++++++++
 cpp/build-support/run-test.sh             |  195 +
 cpp/build-support/stacktrace_addr2line.pl |   92 +
 cpp/cmake_modules/CompilerInfo.cmake      |   46 +
 cpp/cmake_modules/FindGPerf.cmake         |   69 +
 cpp/cmake_modules/FindGTest.cmake         |   91 +
 cpp/cmake_modules/FindParquet.cmake       |   80 +
 cpp/cmake_modules/san-config.cmake        |   92 +
 cpp/setup_build_env.sh                    |   12 +
 cpp/src/arrow/CMakeLists.txt              |   33 +
 cpp/src/arrow/api.h                       |   21 +
 cpp/src/arrow/array-test.cc               |   92 +
 cpp/src/arrow/array.cc                    |   44 +
 cpp/src/arrow/array.h                     |   79 +
 cpp/src/arrow/builder.cc                  |   63 +
 cpp/src/arrow/builder.h                   |  101 +
 cpp/src/arrow/field-test.cc               |   38 +
 cpp/src/arrow/field.h                     |   48 +
 cpp/src/arrow/parquet/CMakeLists.txt      |   35 +
 cpp/src/arrow/test-util.h                 |   97 +
 cpp/src/arrow/type.cc                     |   22 +
 cpp/src/arrow/type.h                      |  180 +
 cpp/src/arrow/types/CMakeLists.txt        |   63 +
 cpp/src/arrow/types/binary.h              |   33 +
 cpp/src/arrow/types/boolean.h             |   35 +
 cpp/src/arrow/types/collection.h          |   45 +
 cpp/src/arrow/types/construct.cc          |   88 +
 cpp/src/arrow/types/construct.h           |   32 +
 cpp/src/arrow/types/datetime.h            |   79 +
 cpp/src/arrow/types/decimal.h             |   32 +
 cpp/src/arrow/types/floating.cc           |   22 +
 cpp/src/arrow/types/floating.h            |   43 +
 cpp/src/arrow/types/integer.cc            |   22 +
 cpp/src/arrow/types/integer.h             |   88 +
 cpp/src/arrow/types/json.cc               |   42 +
 cpp/src/arrow/types/json.h                |   38 +
 cpp/src/arrow/types/list-test.cc          |  166 +
 cpp/src/arrow/types/list.cc               |   31 +
 cpp/src/arrow/types/list.h                |  206 +
 cpp/src/arrow/types/null.h                |   34 +
 cpp/src/arrow/types/primitive-test.cc     |  345 ++
 cpp/src/arrow/types/primitive.cc          |   50 +
 cpp/src/arrow/types/primitive.h           |  240 +
 cpp/src/arrow/types/string-test.cc        |  242 +
 cpp/src/arrow/types/string.cc             |   40 +
 cpp/src/arrow/types/string.h              |  181 +
 cpp/src/arrow/types/struct-test.cc        |   61 +
 cpp/src/arrow/types/struct.cc             |   38 +
 cpp/src/arrow/types/struct.h              |   51 +
 cpp/src/arrow/types/test-common.h         |   50 +
 cpp/src/arrow/types/union.cc              |   49 +
 cpp/src/arrow/types/union.h               |   86 +
 cpp/src/arrow/util/CMakeLists.txt         |   81 +
 cpp/src/arrow/util/bit-util-test.cc       |   44 +
 cpp/src/arrow/util/bit-util.cc            |   46 +
 cpp/src/arrow/util/bit-util.h             |   68 +
 cpp/src/arrow/util/buffer-test.cc         |   58 +
 cpp/src/arrow/util/buffer.cc              |   53 +
 cpp/src/arrow/util/buffer.h               |  133 +
 cpp/src/arrow/util/macros.h               |   26 +
 cpp/src/arrow/util/random.h               |  128 +
 cpp/src/arrow/util/status.cc              |   38 +
 cpp/src/arrow/util/status.h               |  152 +
 cpp/src/arrow/util/test_main.cc           |   26 +
 cpp/thirdparty/build_thirdparty.sh        |   62 +
 cpp/thirdparty/download_thirdparty.sh     |   20 +
 cpp/thirdparty/versions.sh                |    3 +
 73 files changed, 12551 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/.gitignore
----------------------------------------------------------------------
diff --git a/cpp/.gitignore b/cpp/.gitignore
new file mode 100644
index 0000000..ab30247
--- /dev/null
+++ b/cpp/.gitignore
@@ -0,0 +1,21 @@
+thirdparty/
+CMakeFiles/
+CMakeCache.txt
+CTestTestfile.cmake
+Makefile
+cmake_install.cmake
+build/
+Testing/
+
+#########################################
+# Editor temporary/working/backup files #
+.#*
+*\#*\#
+[#]*#
+*~
+*$
+*.bak
+*flymake*
+*.kdev4
+*.log
+*.swp

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
new file mode 100644
index 0000000..90e55df
--- /dev/null
+++ b/cpp/CMakeLists.txt
@@ -0,0 +1,483 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+cmake_minimum_required(VERSION 2.7)
+project(arrow)
+
+set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake_modules")
+
+include(CMakeParseArguments)
+
+set(BUILD_SUPPORT_DIR "${CMAKE_SOURCE_DIR}/build-support")
+set(THIRDPARTY_DIR "${CMAKE_SOURCE_DIR}/thirdparty")
+
+# Allow "make install" to not depend on all targets.
+#
+# Must be declared in the top-level CMakeLists.txt.
+set(CMAKE_SKIP_INSTALL_ALL_DEPENDENCY true)
+
+# Generate a Clang compile_commands.json "compilation database" file for use
+# with various development tools, such as Vim's YouCompleteMe plugin.
+# See http://clang.llvm.org/docs/JSONCompilationDatabase.html
+if ("$ENV{CMAKE_EXPORT_COMPILE_COMMANDS}" STREQUAL "1")
+  set(CMAKE_EXPORT_COMPILE_COMMANDS 1)
+endif()
+
+# Enable using a custom GCC toolchain to build Arrow
+if (NOT "$ENV{ARROW_GCC_ROOT}" STREQUAL "")
+  set(GCC_ROOT $ENV{ARROW_GCC_ROOT})
+  set(CMAKE_C_COMPILER ${GCC_ROOT}/bin/gcc)
+  set(CMAKE_CXX_COMPILER ${GCC_ROOT}/bin/g++)
+endif()
+
+# ----------------------------------------------------------------------
+# cmake options
+
+# Top level cmake dir
+if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
+  option(ARROW_WITH_PARQUET
+    "Build the Parquet adapter and link to libparquet"
+    OFF)
+
+  option(ARROW_BUILD_TESTS
+    "Build the Arrow googletest unit tests"
+    ON)
+endif()
+
+if(NOT ARROW_BUILD_TESTS)
+  set(NO_TESTS 1)
+endif()
+
+
+############################################################
+# Compiler flags
+############################################################
+
+# compiler flags that are common across debug/release builds
+#  - msse3: Enable SSE3 compiler intrinsics.
+#  - Wall: Enable all warnings.
+#  - Wno-sign-compare: suppress warnings for comparison between signed and unsigned
+#    integers
+#  -Wno-deprecated: some of the gutil code includes old things like ext/hash_set, ignore that
+#  - pthread: enable multithreaded malloc
+#  - -D__STDC_FORMAT_MACROS: for PRI* print format macros
+#  -fno-strict-aliasing
+#     Assume programs do not follow strict aliasing rules.
+#     GCC cannot always verify whether strict aliasing rules are indeed followed due to
+#     fundamental limitations in escape analysis, which can result in subtle bad code generation.
+#     This has a small perf hit but worth it to avoid hard to debug crashes.
+set(CXX_COMMON_FLAGS "-std=c++11 -fno-strict-aliasing -msse3 -Wall -Wno-deprecated -pthread -D__STDC_FORMAT_MACROS")
+
+# compiler flags for different build types (run 'cmake -DCMAKE_BUILD_TYPE=<type> .')
+# For all builds:
+# For CMAKE_BUILD_TYPE=Debug
+#   -ggdb: Enable gdb debugging
+# For CMAKE_BUILD_TYPE=FastDebug
+#   Same as DEBUG, except with some optimizations on.
+# For CMAKE_BUILD_TYPE=Release
+#   -O3: Enable all compiler optimizations
+#   -g: Enable symbols for profiler tools (TODO: remove for shipping)
+set(CXX_FLAGS_DEBUG "-ggdb")
+set(CXX_FLAGS_FASTDEBUG "-ggdb -O1")
+set(CXX_FLAGS_RELEASE "-O3 -g -DNDEBUG")
+
+set(CXX_FLAGS_PROFILE_GEN "${CXX_FLAGS_RELEASE} -fprofile-generate")
+set(CXX_FLAGS_PROFILE_BUILD "${CXX_FLAGS_RELEASE} -fprofile-use")
+
+# if no build type is specified, default to debug builds
+if (NOT CMAKE_BUILD_TYPE)
+  set(CMAKE_BUILD_TYPE Debug)
+endif(NOT CMAKE_BUILD_TYPE)
+
+string (TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE)
+
+
+# Set compile flags based on the build type.
+message("Configured for ${CMAKE_BUILD_TYPE} build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})")
+if ("${CMAKE_BUILD_TYPE}" STREQUAL "DEBUG")
+  set(CMAKE_CXX_FLAGS ${CXX_FLAGS_DEBUG})
+elseif ("${CMAKE_BUILD_TYPE}" STREQUAL "FASTDEBUG")
+  set(CMAKE_CXX_FLAGS ${CXX_FLAGS_FASTDEBUG})
+elseif ("${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+  set(CMAKE_CXX_FLAGS ${CXX_FLAGS_RELEASE})
+elseif ("${CMAKE_BUILD_TYPE}" STREQUAL "PROFILE_GEN")
+  set(CMAKE_CXX_FLAGS ${CXX_FLAGS_PROFILE_GEN})
+elseif ("${CMAKE_BUILD_TYPE}" STREQUAL "PROFILE_BUILD")
+  set(CMAKE_CXX_FLAGS ${CXX_FLAGS_PROFILE_BUILD})
+else()
+  message(FATAL_ERROR "Unknown build type: ${CMAKE_BUILD_TYPE}")
+endif ()
+
+# Add common flags
+set(CMAKE_CXX_FLAGS "${CXX_COMMON_FLAGS} ${CMAKE_CXX_FLAGS}")
+
+# Required to avoid static linking errors with dependencies
+add_definitions(-fPIC)
+
+# Determine compiler version
+include(CompilerInfo)
+
+if ("${COMPILER_FAMILY}" STREQUAL "clang")
+  # Clang helpfully provides a few extensions from C++11 such as the 'override'
+  # keyword on methods. This doesn't change behavior, and we selectively enable
+  # it in src/gutil/port.h only on clang. So, we can safely use it, and don't want
+  # to trigger warnings when we do so.
+  # set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-c++11-extensions")
+
+  # Using Clang with ccache causes a bunch of spurious warnings that are
+  # purportedly fixed in the next version of ccache. See the following for details:
+  #
+  #   http://petereisentraut.blogspot.com/2011/05/ccache-and-clang.html
+  #   http://petereisentraut.blogspot.com/2011/09/ccache-and-clang-part-2.html
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Qunused-arguments")
+
+  # Only hardcode -fcolor-diagnostics if stderr is opened on a terminal. Otherwise
+  # the color codes show up as noisy artifacts.
+  #
+  # This test is imperfect because 'cmake' and 'make' can be run independently
+  # (with different terminal options), and we're testing during the former.
+  execute_process(COMMAND test -t 2 RESULT_VARIABLE ARROW_IS_TTY)
+  if ((${ARROW_IS_TTY} EQUAL 0) AND (NOT ("$ENV{TERM}" STREQUAL "dumb")))
+    message("Running in a controlling terminal")
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fcolor-diagnostics")
+  else()
+    message("Running without a controlling terminal or in a dumb terminal")
+  endif()
+
+  # Use libstdc++ and not libc++. The latter lacks support for tr1 on OS X
+  # and has been the default there since 10.9.
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libstdc++")
+endif()
+
+# Sanity check linking option.
+if (NOT ARROW_LINK)
+  set(ARROW_LINK "d")
+elseif(NOT ("auto" MATCHES "^${ARROW_LINK}" OR
+            "dynamic" MATCHES "^${ARROW_LINK}" OR
+            "static" MATCHES "^${ARROW_LINK}"))
+  message(FATAL_ERROR "Unknown value for ARROW_LINK, must be auto|dynamic|static")
+else()
+  # Remove all but the first letter.
+  string(SUBSTRING "${ARROW_LINK}" 0 1 ARROW_LINK)
+endif()
+
+# ASAN / TSAN / UBSAN
+include(san-config)
+
+# For any C code, use the same flags.
+set(CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS}")
+
+# Code coverage
+if ("${ARROW_GENERATE_COVERAGE}")
+  if("${CMAKE_CXX_COMPILER}" MATCHES ".*clang.*")
+    # There appear to be some bugs in clang 3.3 which cause code coverage
+    # to have link errors, not locating the llvm_gcda_* symbols.
+    # This should be fixed in llvm 3.4 with http://llvm.org/viewvc/llvm-project?view=revision&revision=184666
+    message(SEND_ERROR "Cannot currently generate coverage with clang")
+  endif()
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage -DCOVERAGE_BUILD")
+
+  # For coverage to work properly, we need to use static linkage. Otherwise,
+  # __gcov_flush() doesn't properly flush coverage from every module.
+  # See http://stackoverflow.com/questions/28164543/using-gcov-flush-within-a-library-doesnt-force-the-other-modules-to-yield-gc
+  if("${ARROW_LINK}" STREQUAL "a")
+    message("Using static linking for coverage build")
+    set(ARROW_LINK "s")
+  elseif("${ARROW_LINK}" STREQUAL "d")
+    message(SEND_ERROR "Cannot use coverage with dynamic linking")
+  endif()
+endif()
+
+# If we still don't know what kind of linking to perform, choose based on
+# build type (developers like fast builds).
+if ("${ARROW_LINK}" STREQUAL "a")
+  if ("${CMAKE_BUILD_TYPE}" STREQUAL "DEBUG" OR
+      "${CMAKE_BUILD_TYPE}" STREQUAL "FASTDEBUG")
+    message("Using dynamic linking for ${CMAKE_BUILD_TYPE} builds")
+    set(ARROW_LINK "d")
+  else()
+    message("Using static linking for ${CMAKE_BUILD_TYPE} builds")
+    set(ARROW_LINK "s")
+  endif()
+endif()
+
+# Are we using the gold linker? It doesn't work with dynamic linking as
+# weak symbols aren't properly overridden, causing tcmalloc to be omitted.
+# Let's flag this as an error in RELEASE builds (we shouldn't release a
+# product like this).
+#
+# See https://sourceware.org/bugzilla/show_bug.cgi?id=16979 for details.
+#
+# The gold linker is only for ELF binaries, which OSX doesn't use. We can
+# just skip.
+if (NOT APPLE)
+  execute_process(COMMAND ${CMAKE_CXX_COMPILER} -Wl,--version OUTPUT_VARIABLE LINKER_OUTPUT)
+endif ()
+if (LINKER_OUTPUT MATCHES "gold")
+  if ("${ARROW_LINK}" STREQUAL "d" AND
+      "${CMAKE_BUILD_TYPE}" STREQUAL "RELEASE")
+    message(SEND_ERROR "Cannot use gold with dynamic linking in a RELEASE build "
+      "as it would cause tcmalloc symbols to get dropped")
+  else()
+    message("Using gold linker")
+  endif()
+  set(ARROW_USING_GOLD 1)
+else()
+  message("Using ld linker")
+endif()
+
+# Having set ARROW_LINK due to build type and/or sanitizer, it's now safe to
+# act on its value.
+if ("${ARROW_LINK}" STREQUAL "d")
+  set(BUILD_SHARED_LIBS ON)
+
+  # Position independent code is only necessary when producing shared objects.
+  add_definitions(-fPIC)
+endif()
+
+# set compile output directory
+string (TOLOWER ${CMAKE_BUILD_TYPE} BUILD_SUBDIR_NAME)
+
+# If building in-source, create the latest symlink. If building out-of-source,
+# which is preferred, simply output the binaries in the build folder
+if (${CMAKE_SOURCE_DIR} STREQUAL ${CMAKE_CURRENT_BINARY_DIR})
+  set(BUILD_OUTPUT_ROOT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/build/${BUILD_SUBDIR_NAME}/")
+  # Link build/latest to the current build directory, to avoid developers
+  # accidentally running the latest debug build when in fact they're building
+  # release builds.
+  FILE(MAKE_DIRECTORY ${BUILD_OUTPUT_ROOT_DIRECTORY})
+  if (NOT APPLE)
+    set(MORE_ARGS "-T")
+  endif()
+EXECUTE_PROCESS(COMMAND ln ${MORE_ARGS} -sf ${BUILD_OUTPUT_ROOT_DIRECTORY}
+  ${CMAKE_CURRENT_BINARY_DIR}/build/latest)
+else()
+  set(BUILD_OUTPUT_ROOT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/${BUILD_SUBDIR_NAME}/")
+endif()
+
+# where to put generated archives (.a files)
+set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+set(ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+
+# where to put generated libraries (.so files)
+set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+set(LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+
+# where to put generated binaries
+set(EXECUTABLE_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+include_directories(src)
+
+############################################################
+# Visibility
+############################################################
+# For generate_export_header() and add_compiler_export_flags().
+include(GenerateExportHeader)
+
+############################################################
+# Testing
+############################################################
+
+# Add a new test case, with or without an executable that should be built.
+#
+# REL_TEST_NAME is the name of the test. It may be a single component
+# (e.g. monotime-test) or contain additional components (e.g.
+# net/net_util-test). Either way, the last component must be a globally
+# unique name.
+#
+# Arguments after the test name will be passed to set_tests_properties().
+function(ADD_ARROW_TEST REL_TEST_NAME)
+  if(NO_TESTS)
+    return()
+  endif()
+  get_filename_component(TEST_NAME ${REL_TEST_NAME} NAME_WE)
+
+  if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${REL_TEST_NAME}.cc)
+    # This test has a corresponding .cc file, set it up as an executable.
+    set(TEST_PATH "${EXECUTABLE_OUTPUT_PATH}/${TEST_NAME}")
+    add_executable(${TEST_NAME} "${REL_TEST_NAME}.cc")
+    target_link_libraries(${TEST_NAME} ${ARROW_TEST_LINK_LIBS})
+  else()
+    # No executable, just invoke the test (probably a script) directly.
+    set(TEST_PATH ${CMAKE_CURRENT_SOURCE_DIR}/${REL_TEST_NAME})
+  endif()
+
+  add_test(${TEST_NAME}
+    ${BUILD_SUPPORT_DIR}/run-test.sh ${TEST_PATH})
+  if(ARGN)
+    set_tests_properties(${TEST_NAME} PROPERTIES ${ARGN})
+  endif()
+endfunction()
+
+# A wrapper for add_dependencies() that is compatible with NO_TESTS.
+function(ADD_ARROW_TEST_DEPENDENCIES REL_TEST_NAME)
+  if(NO_TESTS)
+    return()
+  endif()
+  get_filename_component(TEST_NAME ${REL_TEST_NAME} NAME_WE)
+
+  add_dependencies(${TEST_NAME} ${ARGN})
+endfunction()
+
+enable_testing()
+
+############################################################
+# Dependencies
+############################################################
+function(ADD_THIRDPARTY_LIB LIB_NAME)
+  set(options)
+  set(one_value_args SHARED_LIB STATIC_LIB)
+  set(multi_value_args DEPS)
+  cmake_parse_arguments(ARG "${options}" "${one_value_args}" "${multi_value_args}" ${ARGN})
+  if(ARG_UNPARSED_ARGUMENTS)
+    message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
+  endif()
+
+  if(("${ARROW_LINK}" STREQUAL "s" AND ARG_STATIC_LIB) OR (NOT ARG_SHARED_LIB))
+    if(NOT ARG_STATIC_LIB)
+      message(FATAL_ERROR "No static or shared library provided for ${LIB_NAME}")
+    endif()
+    add_library(${LIB_NAME} STATIC IMPORTED)
+    set_target_properties(${LIB_NAME}
+      PROPERTIES IMPORTED_LOCATION "${ARG_STATIC_LIB}")
+    message("Added static library dependency ${LIB_NAME}: ${ARG_STATIC_LIB}")
+  else()
+    add_library(${LIB_NAME} SHARED IMPORTED)
+    set_target_properties(${LIB_NAME}
+      PROPERTIES IMPORTED_LOCATION "${ARG_SHARED_LIB}")
+    message("Added shared library dependency ${LIB_NAME}: ${ARG_SHARED_LIB}")
+  endif()
+
+  if(ARG_DEPS)
+    set_target_properties(${LIB_NAME}
+      PROPERTIES IMPORTED_LINK_INTERFACE_LIBRARIES "${ARG_DEPS}")
+  endif()
+endfunction()
+
+## GTest
+if ("$ENV{GTEST_HOME}" STREQUAL "")
+  set(GTest_HOME ${THIRDPARTY_DIR}/googletest-release-1.7.0)
+endif()
+find_package(GTest REQUIRED)
+include_directories(SYSTEM ${GTEST_INCLUDE_DIR})
+ADD_THIRDPARTY_LIB(gtest
+  STATIC_LIB ${GTEST_STATIC_LIB})
+
+## Google PerfTools
+##
+## Disabled with TSAN/ASAN as well as with gold+dynamic linking (see comment
+## near definition of ARROW_USING_GOLD).
+# find_package(GPerf REQUIRED)
+# if (NOT "${ARROW_USE_ASAN}" AND
+#     NOT "${ARROW_USE_TSAN}" AND
+#     NOT ("${ARROW_USING_GOLD}" AND "${ARROW_LINK}" STREQUAL "d"))
+#   ADD_THIRDPARTY_LIB(tcmalloc
+#     STATIC_LIB "${TCMALLOC_STATIC_LIB}"
+#     SHARED_LIB "${TCMALLOC_SHARED_LIB}")
+#   ADD_THIRDPARTY_LIB(profiler
+#     STATIC_LIB "${PROFILER_STATIC_LIB}"
+#     SHARED_LIB "${PROFILER_SHARED_LIB}")
+#   list(APPEND ARROW_BASE_LIBS tcmalloc profiler)
+#   add_definitions("-DTCMALLOC_ENABLED")
+#   set(ARROW_TCMALLOC_AVAILABLE 1)
+# endif()
+
+############################################################
+# Linker setup
+############################################################
+set(ARROW_MIN_TEST_LIBS arrow arrow_test_main arrow_test_util ${ARROW_BASE_LIBS})
+set(ARROW_TEST_LINK_LIBS ${ARROW_MIN_TEST_LIBS})
+
+############################################################
+# "make ctags" target
+############################################################
+if (UNIX)
+  add_custom_target(ctags ctags -R --languages=c++,c)
+endif (UNIX)
+
+############################################################
+# "make etags" target
+############################################################
+if (UNIX)
+  add_custom_target(tags etags --members --declarations
+  `find ${CMAKE_CURRENT_SOURCE_DIR}/src
+   -name \\*.cc -or -name \\*.hh -or -name \\*.cpp -or -name \\*.h -or -name \\*.c -or
+   -name \\*.f`)
+  add_custom_target(etags DEPENDS tags)
+endif (UNIX)
+
+############################################################
+# "make cscope" target
+############################################################
+if (UNIX)
+  add_custom_target(cscope find ${CMAKE_CURRENT_SOURCE_DIR}
+  ( -name \\*.cc -or -name \\*.hh -or -name \\*.cpp -or
+    -name \\*.h -or -name \\*.c -or -name \\*.f )
+  -exec echo \"{}\" \; > cscope.files && cscope -q -b VERBATIM)
+endif (UNIX)
+
+############################################################
+# "make lint" target
+############################################################
+if (UNIX)
+  # Full lint
+  add_custom_target(lint ${BUILD_SUPPORT_DIR}/cpplint.py
+  --verbose=2
+  --linelength=90
+  --filter=-whitespace/comments,-readability/todo,-build/header_guard
+    `find ${CMAKE_CURRENT_SOURCE_DIR}/src -name \\*.cc -or -name \\*.h`)
+endif (UNIX)
+
+#----------------------------------------------------------------------
+# Parquet adapter
+
+if(ARROW_WITH_PARQUET)
+  find_package(Parquet REQUIRED)
+  include_directories(SYSTEM ${PARQUET_INCLUDE_DIR})
+  ADD_THIRDPARTY_LIB(parquet
+    STATIC_LIB ${PARQUET_STATIC_LIB}
+    SHARED_LIB ${PARQUET_SHARED_LIB})
+
+  add_subdirectory(src/arrow/parquet)
+  list(APPEND LINK_LIBS arrow_parquet parquet)
+endif()
+
+############################################################
+# Subdirectories
+############################################################
+
+add_subdirectory(src/arrow)
+add_subdirectory(src/arrow/util)
+add_subdirectory(src/arrow/types)
+
+set(LINK_LIBS
+  arrow_util
+  arrow_types)
+
+set(ARROW_SRCS
+  src/arrow/array.cc
+  src/arrow/builder.cc
+  src/arrow/type.cc
+)
+
+add_library(arrow SHARED
+  ${ARROW_SRCS}
+)
+target_link_libraries(arrow ${LINK_LIBS})
+set_target_properties(arrow PROPERTIES LINKER_LANGUAGE CXX)
+
+install(TARGETS arrow
+  LIBRARY DESTINATION lib)
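
Editorial note on the `ADD_THIRDPARTY_LIB` function in the CMakeLists.txt diff above: its static-versus-shared choice can be modeled in a few lines of Python. This is only a sketch of the branch logic as it reads in the diff, not part of the commit. The function name `choose_library_kind` and its arguments are hypothetical, and CMake's truthiness rules (for example, values ending in `-NOTFOUND`) are approximated by Python truthiness.

```python
def choose_library_kind(arrow_link, static_lib=None, shared_lib=None):
    """Model the branch in ADD_THIRDPARTY_LIB (hypothetical helper).

    arrow_link stands in for ${ARROW_LINK}; "s" requests static linking.
    static_lib / shared_lib stand in for the STATIC_LIB / SHARED_LIB arguments.
    """
    prefer_static = arrow_link == "s" and bool(static_lib)
    if prefer_static or not shared_lib:
        # Mirrors the FATAL_ERROR raised when neither library is provided.
        if not static_lib:
            raise ValueError("No static or shared library provided")
        return ("STATIC", static_lib)
    return ("SHARED", shared_lib)


if __name__ == "__main__":
    print(choose_library_kind("s", "libgtest.a", "libgtest.so"))
    print(choose_library_kind("d", "libgtest.a", "libgtest.so"))
    print(choose_library_kind("d", "libgtest.a"))
```

Under this reading, a static library is used whenever static linking is requested and a static library is available, and also as a fallback whenever no shared library is given.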

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/LICENSE.txt
----------------------------------------------------------------------
diff --git a/cpp/LICENSE.txt b/cpp/LICENSE.txt
new file mode 100644
index 0000000..d645695
--- /dev/null
+++ b/cpp/LICENSE.txt
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/README.md
----------------------------------------------------------------------
diff --git a/cpp/README.md b/cpp/README.md
new file mode 100644
index 0000000..378dc4e
--- /dev/null
+++ b/cpp/README.md
@@ -0,0 +1,48 @@
+# Arrow C++
+
+## Setup Build Environment
+
+Arrow uses CMake as a build configuration system. It currently supports both
+in-source and out-of-source builds; the latter is preferred.
+
+Arrow requires a C++11-enabled compiler. On Linux, gcc 4.8 and higher should be
+sufficient.
+
+To build the thirdparty build dependencies, run:
+
+```
+./thirdparty/download_thirdparty.sh
+./thirdparty/build_thirdparty.sh
+```
+
+You can also run the following from the root of the C++ tree:
+
+```
+source setup_build_env.sh
+```
+
+Arrow is configured to use the `thirdparty` directory by default for its build
+dependencies. To set up a custom toolchain see below.
+
+Simple debug build:
+
+    mkdir debug
+    cd debug
+    cmake ..
+    make
+    ctest
+
+Simple release build:
+
+    mkdir release
+    cd release
+    cmake .. -DCMAKE_BUILD_TYPE=Release
+    make
+    ctest
+
+### Third-party environment variables
+
+To set up your own build toolchain, set the relevant environment
+variables:
+
+* Googletest: `GTEST_HOME` (only required to build the unit tests)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/build-support/asan_symbolize.py
----------------------------------------------------------------------
diff --git a/cpp/build-support/asan_symbolize.py b/cpp/build-support/asan_symbolize.py
new file mode 100755
index 0000000..839a198
--- /dev/null
+++ b/cpp/build-support/asan_symbolize.py
@@ -0,0 +1,360 @@
+#!/usr/bin/env python
+#===- lib/asan/scripts/asan_symbolize.py -----------------------------------===#
+#
+#                     The LLVM Compiler Infrastructure
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+#
+#===------------------------------------------------------------------------===#
+import bisect
+import os
+import re
+import subprocess
+import sys
+
+llvm_symbolizer = None
+symbolizers = {}
+filetypes = {}
+vmaddrs = {}
+DEBUG = False
+
+
+# FIXME: merge the code that calls fix_filename().
+def fix_filename(file_name):
+  for path_to_cut in sys.argv[1:]:
+    file_name = re.sub('.*' + path_to_cut, '', file_name)
+  file_name = re.sub('.*asan_[a-z_]*.cc:[0-9]*', '_asan_rtl_', file_name)
+  file_name = re.sub('.*crtstuff.c:0', '???:0', file_name)
+  return file_name
+
+
+class Symbolizer(object):
+  def __init__(self):
+    pass
+
+  def symbolize(self, addr, binary, offset):
+    """Symbolize the given address (pair of binary and offset).
+
+    Overridden in subclasses.
+    Args:
+        addr: virtual address of an instruction.
+        binary: path to executable/shared object containing this instruction.
+        offset: instruction offset in the @binary.
+    Returns:
+        list of strings (one string for each inlined frame) describing
+        the code locations for this instruction (that is, function name, file
+        name, line and column numbers).
+    """
+    return None
+
+
+class LLVMSymbolizer(Symbolizer):
+  def __init__(self, symbolizer_path):
+    super(LLVMSymbolizer, self).__init__()
+    self.symbolizer_path = symbolizer_path
+    self.pipe = self.open_llvm_symbolizer()
+
+  def open_llvm_symbolizer(self):
+    if not os.path.exists(self.symbolizer_path):
+      return None
+    cmd = [self.symbolizer_path,
+           '--use-symbol-table=true',
+           '--demangle=false',
+           '--functions=true',
+           '--inlining=true']
+    if DEBUG:
+      print ' '.join(cmd)
+    return subprocess.Popen(cmd, stdin=subprocess.PIPE,
+                            stdout=subprocess.PIPE)
+
+  def symbolize(self, addr, binary, offset):
+    """Overrides Symbolizer.symbolize."""
+    if not self.pipe:
+      return None
+    result = []
+    try:
+      symbolizer_input = '%s %s' % (binary, offset)
+      if DEBUG:
+        print symbolizer_input
+      print >> self.pipe.stdin, symbolizer_input
+      while True:
+        function_name = self.pipe.stdout.readline().rstrip()
+        if not function_name:
+          break
+        file_name = self.pipe.stdout.readline().rstrip()
+        file_name = fix_filename(file_name)
+        if (not function_name.startswith('??') and
+            not file_name.startswith('??')):
+          # Append only valid frames.
+          result.append('%s in %s %s' % (addr, function_name,
+                                         file_name))
+    except Exception:
+      result = []
+    if not result:
+      result = None
+    return result
+
+
+def LLVMSymbolizerFactory(system):
+  symbolizer_path = os.getenv('LLVM_SYMBOLIZER_PATH')
+  if not symbolizer_path:
+    # Assume llvm-symbolizer is in PATH.
+    symbolizer_path = 'llvm-symbolizer'
+  return LLVMSymbolizer(symbolizer_path)
+
+
+class Addr2LineSymbolizer(Symbolizer):
+  def __init__(self, binary):
+    super(Addr2LineSymbolizer, self).__init__()
+    self.binary = binary
+    self.pipe = self.open_addr2line()
+
+  def open_addr2line(self):
+    cmd = ['addr2line', '-f', '-e', self.binary]
+    if DEBUG:
+      print ' '.join(cmd)
+    return subprocess.Popen(cmd,
+                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
+
+  def symbolize(self, addr, binary, offset):
+    """Overrides Symbolizer.symbolize."""
+    if self.binary != binary:
+      return None
+    try:
+      print >> self.pipe.stdin, offset
+      function_name = self.pipe.stdout.readline().rstrip()
+      file_name = self.pipe.stdout.readline().rstrip()
+    except Exception:
+      function_name = ''
+      file_name = ''
+    file_name = fix_filename(file_name)
+    return ['%s in %s %s' % (addr, function_name, file_name)]
+
+
+class DarwinSymbolizer(Symbolizer):
+  def __init__(self, addr, binary):
+    super(DarwinSymbolizer, self).__init__()
+    self.binary = binary
+    # Guess which arch we're running. 10 = len('0x') + 8 hex digits.
+    if len(addr) > 10:
+      self.arch = 'x86_64'
+    else:
+      self.arch = 'i386'
+    self.vmaddr = None
+    self.pipe = None
+
+  def write_addr_to_pipe(self, offset):
+    print >> self.pipe.stdin, '0x%x' % int(offset, 16)
+
+  def open_atos(self):
+    if DEBUG:
+      print 'atos -o %s -arch %s' % (self.binary, self.arch)
+    cmdline = ['atos', '-o', self.binary, '-arch', self.arch]
+    self.pipe = subprocess.Popen(cmdline,
+                                 stdin=subprocess.PIPE,
+                                 stdout=subprocess.PIPE,
+                                 stderr=subprocess.PIPE)
+
+  def symbolize(self, addr, binary, offset):
+    """Overrides Symbolizer.symbolize."""
+    if self.binary != binary:
+      return None
+    self.open_atos()
+    self.write_addr_to_pipe(offset)
+    self.pipe.stdin.close()
+    atos_line = self.pipe.stdout.readline().rstrip()
+    # A well-formed atos response looks like this:
+    #   foo(type1, type2) (in object.name) (filename.cc:80)
+    match = re.match('^(.*) \(in (.*)\) \((.*:\d*)\)$', atos_line)
+    if DEBUG:
+      print 'atos_line: ', atos_line
+    if match:
+      function_name = match.group(1)
+      function_name = re.sub('\(.*?\)', '', function_name)
+      file_name = fix_filename(match.group(3))
+      return ['%s in %s %s' % (addr, function_name, file_name)]
+    else:
+      return ['%s in %s' % (addr, atos_line)]
+
+
+# Chain several symbolizers so that if one symbolizer fails, we fall back
+# to the next symbolizer in chain.
+class ChainSymbolizer(Symbolizer):
+  def __init__(self, symbolizer_list):
+    super(ChainSymbolizer, self).__init__()
+    self.symbolizer_list = symbolizer_list
+
+  def symbolize(self, addr, binary, offset):
+    """Overrides Symbolizer.symbolize."""
+    for symbolizer in self.symbolizer_list:
+      if symbolizer:
+        result = symbolizer.symbolize(addr, binary, offset)
+        if result:
+          return result
+    return None
+
+  def append_symbolizer(self, symbolizer):
+    self.symbolizer_list.append(symbolizer)
+
+
+def BreakpadSymbolizerFactory(binary):
+  suffix = os.getenv('BREAKPAD_SUFFIX')
+  if suffix:
+    filename = binary + suffix
+    if os.access(filename, os.F_OK):
+      return BreakpadSymbolizer(filename)
+  return None
+
+
+def SystemSymbolizerFactory(system, addr, binary):
+  if system == 'Darwin':
+    return DarwinSymbolizer(addr, binary)
+  elif system == 'Linux':
+    return Addr2LineSymbolizer(binary)
+
+
+class BreakpadSymbolizer(Symbolizer):
+  def __init__(self, filename):
+    super(BreakpadSymbolizer, self).__init__()
+    self.filename = filename
+    lines = file(filename).readlines()
+    self.files = []
+    self.symbols = {}
+    self.address_list = []
+    self.addresses = {}
+    # MODULE mac x86_64 A7001116478B33F18FF9BEDE9F615F190 t
+    fragments = lines[0].rstrip().split()
+    self.arch = fragments[2]
+    self.debug_id = fragments[3]
+    self.binary = ' '.join(fragments[4:])
+    self.parse_lines(lines[1:])
+
+  def parse_lines(self, lines):
+    cur_function_addr = ''
+    for line in lines:
+      fragments = line.split()
+      if fragments[0] == 'FILE':
+        assert int(fragments[1]) == len(self.files)
+        self.files.append(' '.join(fragments[2:]))
+      elif fragments[0] == 'PUBLIC':
+        self.symbols[int(fragments[1], 16)] = ' '.join(fragments[3:])
+      elif fragments[0] in ['CFI', 'STACK']:
+        pass
+      elif fragments[0] == 'FUNC':
+        cur_function_addr = int(fragments[1], 16)
+        if not cur_function_addr in self.symbols.keys():
+          self.symbols[cur_function_addr] = ' '.join(fragments[4:])
+      else:
+        # Line starting with an address.
+        addr = int(fragments[0], 16)
+        self.address_list.append(addr)
+        # Tuple of symbol address, size, line, file number.
+        self.addresses[addr] = (cur_function_addr,
+                                int(fragments[1], 16),
+                                int(fragments[2]),
+                                int(fragments[3]))
+    self.address_list.sort()
+
+  def get_sym_file_line(self, addr):
+    key = None
+    if addr in self.addresses.keys():
+      key = addr
+    else:
+      index = bisect.bisect_left(self.address_list, addr)
+      if index == 0:
+        return None
+      else:
+        key = self.address_list[index - 1]
+    sym_id, size, line_no, file_no = self.addresses[key]
+    symbol = self.symbols[sym_id]
+    filename = self.files[file_no]
+    if addr < key + size:
+      return symbol, filename, line_no
+    else:
+      return None
+
+  def symbolize(self, addr, binary, offset):
+    if self.binary != binary:
+      return None
+    res = self.get_sym_file_line(int(offset, 16))
+    if res:
+      function_name, file_name, line_no = res
+      result = ['%s in %s %s:%d' % (
+          addr, function_name, file_name, line_no)]
+      print result
+      return result
+    else:
+      return None
+
+
+class SymbolizationLoop(object):
+  def __init__(self, binary_name_filter=None):
+    # Used by clients who may want to supply a different binary name.
+    # E.g. in Chrome several binaries may share a single .dSYM.
+    self.binary_name_filter = binary_name_filter
+    self.system = os.uname()[0]
+    if self.system in ['Linux', 'Darwin']:
+      self.llvm_symbolizer = LLVMSymbolizerFactory(self.system)
+    else:
+      raise Exception('Unknown system')
+
+  def symbolize_address(self, addr, binary, offset):
+    # Use the chain of symbolizers:
+    # Breakpad symbolizer -> LLVM symbolizer -> addr2line/atos
+    # (fall back to next symbolizer if the previous one fails).
+    if not binary in symbolizers:
+      symbolizers[binary] = ChainSymbolizer(
+          [BreakpadSymbolizerFactory(binary), self.llvm_symbolizer])
+    result = symbolizers[binary].symbolize(addr, binary, offset)
+    if result is None:
+      # Initialize system symbolizer only if other symbolizers failed.
+      symbolizers[binary].append_symbolizer(
+          SystemSymbolizerFactory(self.system, addr, binary))
+      result = symbolizers[binary].symbolize(addr, binary, offset)
+    # The system symbolizer must produce some result.
+    assert result
+    return result
+
+  def print_symbolized_lines(self, symbolized_lines):
+    if not symbolized_lines:
+      print self.current_line
+    else:
+      for symbolized_frame in symbolized_lines:
+        print '    #' + str(self.frame_no) + ' ' + symbolized_frame.rstrip()
+        self.frame_no += 1
+
+  def process_stdin(self):
+    self.frame_no = 0
+    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
+
+    while True:
+      line = sys.stdin.readline()
+      if not line: break
+      self.current_line = line.rstrip()
+      #0 0x7f6e35cf2e45  (/blah/foo.so+0x11fe45)
+      stack_trace_line_format = (
+          '^( *#([0-9]+) *)(0x[0-9a-f]+) *\((.*)\+(0x[0-9a-f]+)\)')
+      match = re.match(stack_trace_line_format, line)
+      if not match:
+        print self.current_line
+        continue
+      if DEBUG:
+        print line
+      _, frameno_str, addr, binary, offset = match.groups()
+      if frameno_str == '0':
+        # Assume that frame #0 is the first frame of new stack trace.
+        self.frame_no = 0
+      original_binary = binary
+      if self.binary_name_filter:
+        binary = self.binary_name_filter(binary)
+      symbolized_line = self.symbolize_address(addr, binary, offset)
+      if not symbolized_line:
+        if original_binary != binary:
+          symbolized_line = self.symbolize_address(addr, original_binary, offset)
+      self.print_symbolized_lines(symbolized_line)
+
+
+if __name__ == '__main__':
+  loop = SymbolizationLoop()
+  loop.process_stdin()
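
Editorial note on asan_symbolize.py above: the script matches each input line against a stack-frame pattern and then tries a chain of symbolizers until one returns a result. The sketch below restates those two steps in Python 3 (the script itself is Python 2). The regular expression is copied from `process_stdin`; the names `parse_frame` and `symbolize_with_fallback` are hypothetical and do not appear in the script, and symbolizers are modeled as plain callables rather than `Symbolizer` subclasses.

```python
import re

# Pattern copied from SymbolizationLoop.process_stdin in the diff above.
STACK_FRAME_RE = re.compile(
    r'^( *#([0-9]+) *)(0x[0-9a-f]+) *\((.*)\+(0x[0-9a-f]+)\)')


def parse_frame(line):
    """Split an ASan stack-trace line into its parts, or return None."""
    match = STACK_FRAME_RE.match(line)
    if not match:
        return None
    _, frame_no, addr, binary, offset = match.groups()
    return {"frame_no": int(frame_no), "addr": addr,
            "binary": binary, "offset": offset}


def symbolize_with_fallback(symbolizers, addr, binary, offset):
    """Try each callable in order, as ChainSymbolizer does, skipping None."""
    for symbolize in symbolizers:
        if symbolize is None:
            continue
        result = symbolize(addr, binary, offset)
        if result:
            return result
    return None


if __name__ == "__main__":
    print(parse_frame("    #0 0x7f6e35cf2e45  (/blah/foo.so+0x11fe45)"))
```

Lines that do not match the pattern are echoed unchanged by the original script; here they simply yield `None`.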

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/build-support/bootstrap_toolchain.py
----------------------------------------------------------------------
diff --git a/cpp/build-support/bootstrap_toolchain.py b/cpp/build-support/bootstrap_toolchain.py
new file mode 100755
index 0000000..128be78
--- /dev/null
+++ b/cpp/build-support/bootstrap_toolchain.py
@@ -0,0 +1,114 @@
+#!/usr/bin/env python
+# Copyright (c) 2015, Cloudera, inc.
+# Confidential Cloudera Information: Covered by NDA.
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Bootstrapping the native toolchain with prebuilt binaries
+#
+# The purpose of this script is to download prebuilt artifacts of the native toolchain to
+# satisfy the third-party dependencies. The script checks for the presence of
+# NATIVE_TOOLCHAIN. NATIVE_TOOLCHAIN indicates the location where the prebuilt artifacts
+# should be extracted to.
+#
+# The script is called as follows without any additional parameters:
+#
+#     python bootstrap_toolchain.py
+import sh
+import os
+import sys
+import re
+
+HOST = "https://native-toolchain.s3.amazonaws.com/build"
+
+OS_MAPPING = {
+  "centos6" : "ec2-package-centos-6",
+  "centos5" : "ec2-package-centos-5",
+  "centos7" : "ec2-package-centos-7",
+  "debian6" : "ec2-package-debian-6",
+  "debian7" : "ec2-package-debian-7",
+  "suselinux11": "ec2-package-sles-11",
+  "ubuntu12.04" : "ec2-package-ubuntu-12-04",
+  "ubuntu14.04" : "ec2-package-ubuntu-14-04"
+}
+
+def get_release_label():
+  """Gets the right package label from the OS version"""
+  release = "".join(map(lambda x: x.lower(), sh.lsb_release("-irs").split()))
+  for k, v in OS_MAPPING.iteritems():
+    if re.search(k, release):
+      return v
+
+  print("Pre-built toolchain archives not available for your platform.")
+  print("Clone and build native toolchain from source using this repository:")
+  print("    https://github.com/cloudera/native-toolchain")
+  raise Exception("Could not find package label for OS version: {0}.".format(release))
+
+def download_package(destination, product, version, compiler):
+  label = get_release_label()
+  file_name = "{0}-{1}-{2}-{3}.tar.gz".format(product, version, compiler, label)
+  url_path="/{0}/{1}-{2}/{0}-{1}-{2}-{3}.tar.gz".format(product, version, compiler, label)
+  download_path = HOST + url_path
+
+  print "URL {0}".format(download_path)
+  print "Downloading {0} to {1}".format(file_name, destination)
+  # --no-clobber avoids downloading the file if a file with the name already exists
+  sh.wget(download_path, directory_prefix=destination, no_clobber=True)
+  print "Extracting {0}".format(file_name)
+  sh.tar(z=True, x=True, f=os.path.join(destination, file_name), directory=destination)
+  sh.rm(os.path.join(destination, file_name))
+
+
+def bootstrap(packages):
+  """Validates the presence of $NATIVE_TOOLCHAIN in the environment. By checking
+  $NATIVE_TOOLCHAIN is present, we assume that {LIB}_VERSION will be present as well. Will
+  create the directory specified by $NATIVE_TOOLCHAIN if it does not yet exist. Each of
+  the packages specified in `packages` is downloaded and extracted into $NATIVE_TOOLCHAIN.
+  """
+  # Create the destination directory if necessary
+  destination = os.getenv("NATIVE_TOOLCHAIN")
+  if not destination:
+    print("Build environment not set up correctly, make sure "
+          "$NATIVE_TOOLCHAIN is present.")
+    sys.exit(1)
+
+  if not os.path.exists(destination):
+    os.makedirs(destination)
+
+  # Detect the compiler
+  if "SYSTEM_GCC" in os.environ:
+    compiler = "gcc-system"
+  else:
+    compiler = "gcc-{0}".format(os.environ["GCC_VERSION"])
+
+  for p in packages:
+    pkg_name, pkg_version = unpack_name_and_version(p)
+    download_package(destination, pkg_name, pkg_version, compiler)
+
+def unpack_name_and_version(package):
+  """A package definition is either a string where the version is fetched from the
+  environment or a tuple where the package name and the package version are fully
+  specified.
+  """
+  if isinstance(package, basestring):
+    env_var = "{0}_VERSION".format(package).replace("-", "_").upper()
+    try:
+      return package, os.environ[env_var]
+    except KeyError:
+      raise Exception("Could not find version for {0} in environment var {1}".format(
+        package, env_var))
+  return package[0], package[1]
+
+if __name__ == "__main__":
+  packages = [("gcc","4.9.2"), ("gflags", "2.0"), ("glog", "0.3.3-p1"),
+              ("gperftools", "2.3"), ("libunwind", "1.1"), ("googletest", "20151222")]
+  bootstrap(packages)
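
Editorial note on bootstrap_toolchain.py: the naming rules used by download_package and unpack_name_and_version can be restated in isolation. What follows is a minimal sketch and not part of the commit. The helper names version_env_var and archive_url are hypothetical, nothing is downloaded, and the only inputs assumed are the format strings shown in the script.

```python
# Sketch only: restates the naming rules from bootstrap_toolchain.py.
# version_env_var and archive_url are hypothetical helper names.

HOST = "https://native-toolchain.s3.amazonaws.com/build"


def version_env_var(package):
    """Name of the environment variable that holds a package's version."""
    return "{0}_VERSION".format(package).replace("-", "_").upper()


def archive_url(product, version, compiler, label):
    """URL of the pre-built archive, following download_package()."""
    url_path = "/{0}/{1}-{2}/{0}-{1}-{2}-{3}.tar.gz".format(
        product, version, compiler, label)
    return HOST + url_path
```

A package given as a plain string such as "gperftools" would therefore be looked up through the GPERFTOOLS_VERSION environment variable, while a (name, version) tuple bypasses the environment entirely.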


[15/17] arrow git commit: ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a cmake-based build system. The codebase generally follows Google C++ style guide, but more cleaning to be more conforming is

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/build-support/run-test.sh
----------------------------------------------------------------------
diff --git a/cpp/build-support/run-test.sh b/cpp/build-support/run-test.sh
new file mode 100755
index 0000000..b203913
--- /dev/null
+++ b/cpp/build-support/run-test.sh
@@ -0,0 +1,195 @@
+#!/bin/bash
+# Copyright 2014 Cloudera, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Script which wraps running a test and redirects its output to a
+# test log directory.
+#
+# If KUDU_COMPRESS_TEST_OUTPUT is non-empty, then the logs will be
+# gzip-compressed while they are written.
+#
+# If KUDU_FLAKY_TEST_ATTEMPTS is non-zero, and the test being run matches
+# one of the lines in the file KUDU_FLAKY_TEST_LIST, then the test will
+# be retried on failure up to the specified number of times. This can be
+# used in the gerrit workflow to prevent annoying false -1s caused by
+# tests that are known to be flaky in master.
+#
+# If KUDU_REPORT_TEST_RESULTS is non-zero, then tests are reported to the
+# central test server.
+
+ROOT=$(cd $(dirname $BASH_SOURCE)/..; pwd)
+
+TEST_LOGDIR=$ROOT/build/test-logs
+mkdir -p $TEST_LOGDIR
+
+TEST_DEBUGDIR=$ROOT/build/test-debug
+mkdir -p $TEST_DEBUGDIR
+
+TEST_DIRNAME=$(cd $(dirname $1); pwd)
+TEST_FILENAME=$(basename $1)
+shift
+TEST_EXECUTABLE="$TEST_DIRNAME/$TEST_FILENAME"
+TEST_NAME=$(echo $TEST_FILENAME | perl -pe 's/\..+?$//') # Remove path and extension (if any).
+
+# We run each test in its own subdir to avoid core file related races.
+TEST_WORKDIR=$ROOT/build/test-work/$TEST_NAME
+mkdir -p $TEST_WORKDIR
+pushd $TEST_WORKDIR >/dev/null || exit 1
+rm -f *
+
+set -o pipefail
+
+LOGFILE=$TEST_LOGDIR/$TEST_NAME.txt
+XMLFILE=$TEST_LOGDIR/$TEST_NAME.xml
+
+TEST_EXECUTION_ATTEMPTS=1
+
+# Remove both the uncompressed and the compressed output, so the developer doesn't
+# accidentally read output from a prior test run.
+rm -f $LOGFILE $LOGFILE.gz
+
+pipe_cmd=cat
+
+# Configure TSAN (ignored if this isn't a TSAN build).
+#
+# Deadlock detection (new in clang 3.5) is disabled because:
+# 1. The clang 3.5 deadlock detector crashes in some unit tests. It
+#    needs compiler-rt commits c4c3dfd, 9a8efe3, and possibly others.
+# 2. Many unit tests report lock-order-inversion warnings; they should be
+#    fixed before reenabling the detector.
+TSAN_OPTIONS="$TSAN_OPTIONS detect_deadlocks=0"
+TSAN_OPTIONS="$TSAN_OPTIONS suppressions=$ROOT/build-support/tsan-suppressions.txt"
+TSAN_OPTIONS="$TSAN_OPTIONS history_size=7"
+export TSAN_OPTIONS
+
+# Enable leak detection even under LLVM 3.4, where it was disabled by default.
+# This flag only takes effect when running an ASAN build.
+ASAN_OPTIONS="$ASAN_OPTIONS detect_leaks=1"
+export ASAN_OPTIONS
+
+# Set up suppressions for LeakSanitizer
+LSAN_OPTIONS="$LSAN_OPTIONS suppressions=$ROOT/build-support/lsan-suppressions.txt"
+export LSAN_OPTIONS
+
+# Suppressions require symbolization. We'll default to using the symbolizer in
+# thirdparty.
+if [ -z "$ASAN_SYMBOLIZER_PATH" ]; then
+  export ASAN_SYMBOLIZER_PATH=$(find $NATIVE_TOOLCHAIN/llvm-3.7.0/bin -name llvm-symbolizer)
+fi
+
+# Allow for collecting core dumps.
+ARROW_TEST_ULIMIT_CORE=${ARROW_TEST_ULIMIT_CORE:-0}
+ulimit -c $ARROW_TEST_ULIMIT_CORE
+
+# Run the actual test.
+for ATTEMPT_NUMBER in $(seq 1 $TEST_EXECUTION_ATTEMPTS) ; do
+  if [ $ATTEMPT_NUMBER -lt $TEST_EXECUTION_ATTEMPTS ]; then
+    # If the test fails, the test output may or may not be left behind,
+    # depending on whether the test cleaned up or exited immediately. Either
+    # way we need to clean it up. We do this by comparing the data directory
+    # contents before and after the test runs, and deleting anything new.
+    #
+    # The comm program requires that its two inputs be sorted.
+    TEST_TMPDIR_BEFORE=$(find $TEST_TMPDIR -maxdepth 1 -type d | sort)
+  fi
+
+  # gtest won't overwrite old junit test files, resulting in a build failure
+  # even when retries are successful.
+  rm -f $XMLFILE
+
+  echo "Running $TEST_NAME, redirecting output into $LOGFILE" \
+    "(attempt ${ATTEMPT_NUMBER}/$TEST_EXECUTION_ATTEMPTS)"
+  $TEST_EXECUTABLE "$@" 2>&1 \
+    | $ROOT/build-support/asan_symbolize.py \
+    | c++filt \
+    | $ROOT/build-support/stacktrace_addr2line.pl $TEST_EXECUTABLE \
+    | $pipe_cmd > $LOGFILE
+  STATUS=$?
+
+  # TSAN doesn't always exit with a non-zero exit code due to a bug:
+  # mutex errors don't get reported through the normal error reporting infrastructure.
+  # So we make sure to detect this and exit 1.
+  #
+  # Additionally, certain types of failures won't show up in the standard JUnit
+  # XML output from gtest. We assume that gtest knows better than us and our
+  # regexes in most cases, but for certain errors we delete the resulting xml
+  # file and let our own post-processing step regenerate it.
+  export GREP=$(which egrep)
+  if zgrep --silent "ThreadSanitizer|Leak check.*detected leaks" $LOGFILE ; then
+    echo ThreadSanitizer or leak check failures in $LOGFILE
+    STATUS=1
+    rm -f $XMLFILE
+  fi
+
+  if [ $ATTEMPT_NUMBER -lt $TEST_EXECUTION_ATTEMPTS ]; then
+    # Now delete any new test output.
+    TEST_TMPDIR_AFTER=$(find $TEST_TMPDIR -maxdepth 1 -type d | sort)
+    DIFF=$(comm -13 <(echo "$TEST_TMPDIR_BEFORE") \
+                    <(echo "$TEST_TMPDIR_AFTER"))
+    for DIR in $DIFF; do
+      # Multiple tests may be running concurrently. To avoid deleting the
+      # wrong directories, constrain to only directories beginning with the
+      # test name.
+      #
+      # This may delete old test directories belonging to this test, but
+      # that's not typically a concern when rerunning flaky tests.
+      if [[ $DIR =~ ^$TEST_TMPDIR/$TEST_NAME ]]; then
+        echo Deleting leftover flaky test directory "$DIR"
+        rm -Rf "$DIR"
+      fi
+    done
+  fi
+
+  if [ "$STATUS" -eq "0" ]; then
+    break
+  elif [ "$ATTEMPT_NUMBER" -lt "$TEST_EXECUTION_ATTEMPTS" ]; then
+    echo Test failed attempt number $ATTEMPT_NUMBER
+    echo Will retry...
+  fi
+done
+
+# If we have a LeakSanitizer report, and XML reporting is configured, add a new test
+# case result to the XML file for the leak report. Otherwise Jenkins won't show
+# us which tests had LSAN errors.
+if zgrep --silent "ERROR: LeakSanitizer: detected memory leaks" $LOGFILE ; then
+    echo Test had memory leaks. Editing XML
+    perl -p -i -e '
+    if (m#</testsuite>#) {
+      print "<testcase name=\"LeakSanitizer\" status=\"run\" classname=\"LSAN\">\n";
+      print "  <failure message=\"LeakSanitizer failed\" type=\"\">\n";
+      print "    See txt log file for details\n";
+      print "  </failure>\n";
+      print "</testcase>\n";
+    }' $XMLFILE
+fi
+
+# Capture and compress core file and binary.
+COREFILES=$(ls | grep ^core)
+if [ -n "$COREFILES" ]; then
+  echo Found core dump. Saving executable and core files.
+  gzip < $TEST_EXECUTABLE > "$TEST_DEBUGDIR/$TEST_NAME.gz" || exit $?
+  for COREFILE in $COREFILES; do
+    gzip < $COREFILE > "$TEST_DEBUGDIR/$TEST_NAME.$COREFILE.gz" || exit $?
+  done
+  # Pull in any .so files as well.
+  for LIB in $(ldd $TEST_EXECUTABLE | grep $ROOT | awk '{print $3}'); do
+    LIB_NAME=$(basename $LIB)
+    gzip < $LIB > "$TEST_DEBUGDIR/$LIB_NAME.gz" || exit $?
+  done
+fi
+
+popd
+rm -Rf $TEST_WORKDIR
+
+exit $STATUS
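
Editorial note on run-test.sh: the retry cleanup compares sorted directory listings taken before and after a test run and deletes only the new entries that start with the test name. A minimal Python restatement of that selection step is sketched below. It is not part of the commit, leftover_dirs is a hypothetical name, and it substitutes a plain prefix test for the script's anchored regex match.

```python
# Sketch only: a Python restatement of the comm(1)-based selection in
# run-test.sh. leftover_dirs is a hypothetical helper name.


def leftover_dirs(before, after, tmpdir, test_name):
    """Return directories present only in `after` that belong to this test."""
    prefix = "{0}/{1}".format(tmpdir, test_name)
    # Equivalent of `comm -13 before after`: entries unique to `after`.
    new_dirs = sorted(set(after) - set(before))
    # The shell script uses an anchored regex match; a prefix test is used
    # here, which agrees with it only when the names contain no regex
    # metacharacters.
    return [d for d in new_dirs if d.startswith(prefix)]
```

Restricting deletion to the test's own prefix is what keeps concurrently running tests from removing each other's directories.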

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/build-support/stacktrace_addr2line.pl
----------------------------------------------------------------------
diff --git a/cpp/build-support/stacktrace_addr2line.pl b/cpp/build-support/stacktrace_addr2line.pl
new file mode 100755
index 0000000..7664bab
--- /dev/null
+++ b/cpp/build-support/stacktrace_addr2line.pl
@@ -0,0 +1,92 @@
+#!/usr/bin/perl
+# Copyright 2014 Cloudera, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#######################################################################
+# This script will convert a stack trace with addresses:
+#     @           0x5fb015 kudu::master::Master::Init()
+#     @           0x5c2d38 kudu::master::MiniMaster::StartOnPorts()
+#     @           0x5c31fa kudu::master::MiniMaster::Start()
+#     @           0x58270a kudu::MiniCluster::Start()
+#     @           0x57dc71 kudu::CreateTableStressTest::SetUp()
+# To one with line numbers:
+#     @           0x5fb015 kudu::master::Master::Init() at /home/mpercy/src/kudu/src/master/master.cc:54
+#     @           0x5c2d38 kudu::master::MiniMaster::StartOnPorts() at /home/mpercy/src/kudu/src/master/mini_master.cc:52
+#     @           0x5c31fa kudu::master::MiniMaster::Start() at /home/mpercy/src/kudu/src/master/mini_master.cc:33
+#     @           0x58270a kudu::MiniCluster::Start() at /home/mpercy/src/kudu/src/integration-tests/mini_cluster.cc:48
+#     @           0x57dc71 kudu::CreateTableStressTest::SetUp() at /home/mpercy/src/kudu/src/integration-tests/create-table-stress-test.cc:61
+#
+# If the script detects that the output is not symbolized, it will also attempt
+# to determine the function names, i.e. it will convert:
+#     @           0x5fb015
+#     @           0x5c2d38
+#     @           0x5c31fa
+# To:
+#     @           0x5fb015 kudu::master::Master::Init() at /home/mpercy/src/kudu/src/master/master.cc:54
+#     @           0x5c2d38 kudu::master::MiniMaster::StartOnPorts() at /home/mpercy/src/kudu/src/master/mini_master.cc:52
+#     @           0x5c31fa kudu::master::MiniMaster::Start() at /home/mpercy/src/kudu/src/master/mini_master.cc:33
+#######################################################################
+use strict;
+use warnings;
+
+if (!@ARGV) {
+  die <<EOF
+Usage: $0 executable [stack-trace-file]
+
+This script will read addresses from a file containing stack traces and
+will convert the addresses that conform to the pattern " @ 0x123456" to line
+numbers by calling addr2line on the provided executable.
+If no stack-trace-file is specified, it will take input from stdin.
+EOF
+}
+
+# el6 and other older systems don't support the -p flag,
+# so we do our own "pretty" parsing.
+sub parse_addr2line_output($$) {
+  defined(my $output = shift) or die;
+  defined(my $lookup_func_name = shift) or die;
+  my @lines = grep { $_ ne '' } split("\n", $output);
+  my $pretty_str = '';
+  if ($lookup_func_name) {
+    $pretty_str .= ' ' . $lines[0];
+  }
+  $pretty_str .= ' at ' . $lines[1];
+  return $pretty_str;
+}
+
+my $binary = shift @ARGV;
+if (! -x $binary || ! -r $binary) {
+  die "Error: Cannot access executable ($binary)";
+}
+
+# Cache lookups to speed processing of files with repeated trace addresses.
+my %addr2line_map = ();
+
+# Disable stdout buffering
+$| = 1;
+
+# Reading from <ARGV> is magical in Perl.
+while (defined(my $input = <ARGV>)) {
+  if ($input =~ /^\s+\@\s+(0x[[:xdigit:]]{6,})(?:\s+(\S+))?/) {
+    my $addr = $1;
+    my $lookup_func_name = (!defined $2);
+    if (!exists($addr2line_map{$addr})) {
+      $addr2line_map{$addr} = `addr2line -ifC -e $binary $addr`;
+    }
+    chomp $input;
+    $input .= parse_addr2line_output($addr2line_map{$addr}, $lookup_func_name) . "\n";
+  }
+  print $input;
+}
+
+exit 0;
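
Editorial note on stacktrace_addr2line.pl: the per-line logic (match a frame, look the address up once, append the result) can be sketched in Python as follows. This is not part of the commit. The names FRAME_RE, format_lookup and annotate are hypothetical, and the resolve argument stands in for the addr2line call so that the sketch runs without a binary.

```python
import re

# Sketch only. Same shape as the Perl pattern: leading whitespace, "@", an
# address of at least six hex digits, and an optional function name that is
# present when the trace is already symbolized.
FRAME_RE = re.compile(r"^\s+@\s+(0x[0-9A-Fa-f]{6,})(?:\s+(\S+))?")


def format_lookup(output, lookup_func_name):
    """Mirror parse_addr2line_output: function name line, then file:line."""
    lines = [l for l in output.split("\n") if l != ""]
    pretty = ""
    if lookup_func_name:
        pretty += " " + lines[0]
    pretty += " at " + lines[1]
    return pretty


def annotate(line, resolve, cache):
    """Append source information to one trace line; other lines pass through."""
    m = FRAME_RE.match(line)
    if not m:
        return line
    addr = m.group(1)
    lookup_func_name = m.group(2) is None
    if addr not in cache:
        # In the Perl script this is the `addr2line -ifC -e $binary $addr` call.
        cache[addr] = resolve(addr)
    return line.rstrip("\n") + format_lookup(cache[addr], lookup_func_name) + "\n"
```

The cache matters because stack traces in a test log tend to repeat the same frames, and each lookup otherwise spawns a new process.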

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/cmake_modules/CompilerInfo.cmake
----------------------------------------------------------------------
diff --git a/cpp/cmake_modules/CompilerInfo.cmake b/cpp/cmake_modules/CompilerInfo.cmake
new file mode 100644
index 0000000..0786068
--- /dev/null
+++ b/cpp/cmake_modules/CompilerInfo.cmake
@@ -0,0 +1,46 @@
+# Copyright 2013 Cloudera, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Sets COMPILER_FAMILY to 'clang' or 'gcc'
+# Sets COMPILER_VERSION to the version
+execute_process(COMMAND "${CMAKE_CXX_COMPILER}" -v
+                ERROR_VARIABLE COMPILER_VERSION_FULL)
+message(STATUS "${COMPILER_VERSION_FULL}")
+
+# clang on Linux and Mac OS X before 10.9
+if("${COMPILER_VERSION_FULL}" MATCHES ".*clang version.*")
+  set(COMPILER_FAMILY "clang")
+  string(REGEX REPLACE ".*clang version ([0-9]+\\.[0-9]+).*" "\\1"
+    COMPILER_VERSION "${COMPILER_VERSION_FULL}")
+# clang on Mac OS X 10.9 and later
+elseif("${COMPILER_VERSION_FULL}" MATCHES ".*based on LLVM.*")
+  set(COMPILER_FAMILY "clang")
+  string(REGEX REPLACE ".*based on LLVM ([0-9]+\\.[0-9]+).*" "\\1"
+    COMPILER_VERSION "${COMPILER_VERSION_FULL}")
+
+# clang on Mac OS X, XCode 7. No version replacement is done
+# because Apple no longer advertises the upstream LLVM version.
+elseif("${COMPILER_VERSION_FULL}" MATCHES "clang-700\\..*")
+  set(COMPILER_FAMILY "clang")
+
+# gcc
+elseif("${COMPILER_VERSION_FULL}" MATCHES ".*gcc version.*")
+  set(COMPILER_FAMILY "gcc")
+  string(REGEX REPLACE ".*gcc version ([0-9\\.]+).*" "\\1"
+    COMPILER_VERSION "${COMPILER_VERSION_FULL}")
+else()
+  message(FATAL_ERROR "Unknown compiler. Version info:\n${COMPILER_VERSION_FULL}")
+endif()
+message("Selected compiler ${COMPILER_FAMILY} ${COMPILER_VERSION}")
+
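
Editorial note on CompilerInfo.cmake: the detection order (clang version, based on LLVM, clang-700, gcc version) can be exercised outside CMake with a small Python sketch. This is not part of the commit, and detect_compiler is a hypothetical name. It writes both version components as [0-9]+ and returns None where the module leaves COMPILER_VERSION unset. Unlike the CMake code, it does not fall back to the full string when a version pattern fails to match.

```python
import re

# Sketch only: detect_compiler is a hypothetical helper name.


def detect_compiler(version_output):
    """Return (family, version); version is None when it is not extracted."""
    m = re.search(r"clang version ([0-9]+\.[0-9]+)", version_output)
    if m:
        return "clang", m.group(1)
    m = re.search(r"based on LLVM ([0-9]+\.[0-9]+)", version_output)
    if m:
        return "clang", m.group(1)
    # Xcode 7 style output: family only, no upstream LLVM version.
    if re.search(r"clang-700\.", version_output):
        return "clang", None
    m = re.search(r"gcc version ([0-9.]+)", version_output)
    if m:
        return "gcc", m.group(1)
    raise ValueError("Unknown compiler. Version info:\n" + version_output)
```

The order of the checks is significant: the more specific "clang version" and "based on LLVM" patterns are tried before the bare clang-700 marker.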

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/cmake_modules/FindGPerf.cmake
----------------------------------------------------------------------
diff --git a/cpp/cmake_modules/FindGPerf.cmake b/cpp/cmake_modules/FindGPerf.cmake
new file mode 100644
index 0000000..e831079
--- /dev/null
+++ b/cpp/cmake_modules/FindGPerf.cmake
@@ -0,0 +1,69 @@
+# -*- cmake -*-
+
+# - Find Google perftools
+# Find the Google perftools includes and libraries
+# This module defines
+#  GOOGLE_PERFTOOLS_INCLUDE_DIR, where to find heap-profiler.h, etc.
+#  GOOGLE_PERFTOOLS_FOUND, If false, do not try to use Google perftools.
+# also defined for general use are
+#  TCMALLOC_LIBS, where to find the tcmalloc libraries.
+#  TCMALLOC_STATIC_LIB, path to libtcmalloc.a.
+#  TCMALLOC_SHARED_LIB, path to libtcmalloc's shared library
+#  PROFILER_LIBS, where to find the profiler libraries.
+#  PROFILER_STATIC_LIB, path to libprofiler.a.
+#  PROFILER_SHARED_LIB, path to libprofiler's shared library
+
+FIND_PATH(GOOGLE_PERFTOOLS_INCLUDE_DIR google/heap-profiler.h
+  $ENV{NATIVE_TOOLCHAIN}/gperftools-$ENV{GPERFTOOLS_VERSION}/include
+  NO_DEFAULT_PATH
+)
+
+SET(GPERF_LIB_SEARCH $ENV{NATIVE_TOOLCHAIN}/gperftools-$ENV{GPERFTOOLS_VERSION}/lib)
+
+FIND_LIBRARY(TCMALLOC_LIB_PATH
+  NAMES libtcmalloc.a
+  PATHS ${GPERF_LIB_SEARCH}
+  NO_DEFAULT_PATH
+)
+
+IF (TCMALLOC_LIB_PATH AND GOOGLE_PERFTOOLS_INCLUDE_DIR)
+    SET(TCMALLOC_LIBS ${GPERF_LIB_SEARCH})
+    SET(TCMALLOC_LIB_NAME libtcmalloc)
+    SET(TCMALLOC_STATIC_LIB ${GPERF_LIB_SEARCH}/${TCMALLOC_LIB_NAME}.a)
+    SET(TCMALLOC_SHARED_LIB ${TCMALLOC_LIBS}/${TCMALLOC_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+    SET(GOOGLE_PERFTOOLS_FOUND "YES")
+ELSE (TCMALLOC_LIB_PATH AND GOOGLE_PERFTOOLS_INCLUDE_DIR)
+  SET(GOOGLE_PERFTOOLS_FOUND "NO")
+ENDIF (TCMALLOC_LIB_PATH AND GOOGLE_PERFTOOLS_INCLUDE_DIR)
+
+FIND_LIBRARY(PROFILER_LIB_PATH
+  NAMES libprofiler.a
+  PATHS ${GPERF_LIB_SEARCH}
+)
+
+IF (PROFILER_LIB_PATH AND GOOGLE_PERFTOOLS_INCLUDE_DIR)
+  SET(PROFILER_LIBS ${GPERF_LIB_SEARCH})
+  SET(PROFILER_LIB_NAME libprofiler)
+  SET(PROFILER_STATIC_LIB ${GPERF_LIB_SEARCH}/${PROFILER_LIB_NAME}.a)
+  SET(PROFILER_SHARED_LIB ${PROFILER_LIBS}/${PROFILER_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+ENDIF (PROFILER_LIB_PATH AND GOOGLE_PERFTOOLS_INCLUDE_DIR)
+
+IF (GOOGLE_PERFTOOLS_FOUND)
+  IF (NOT GPerf_FIND_QUIETLY)
+    MESSAGE(STATUS "Found the Google perftools library: ${TCMALLOC_LIBS}")
+  ENDIF (NOT GPerf_FIND_QUIETLY)
+ELSE (GOOGLE_PERFTOOLS_FOUND)
+  IF (GPerf_FIND_REQUIRED)
+    MESSAGE(FATAL_ERROR "Could not find the Google perftools library")
+  ENDIF (GPerf_FIND_REQUIRED)
+ENDIF (GOOGLE_PERFTOOLS_FOUND)
+
+MARK_AS_ADVANCED(
+  TCMALLOC_LIBS
+  TCMALLOC_STATIC_LIB
+  TCMALLOC_SHARED_LIB
+  PROFILER_LIBS
+  PROFILER_STATIC_LIB
+  PROFILER_SHARED_LIB
+  GOOGLE_PERFTOOLS_INCLUDE_DIR
+)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/cmake_modules/FindGTest.cmake
----------------------------------------------------------------------
diff --git a/cpp/cmake_modules/FindGTest.cmake b/cpp/cmake_modules/FindGTest.cmake
new file mode 100644
index 0000000..e47faf0
--- /dev/null
+++ b/cpp/cmake_modules/FindGTest.cmake
@@ -0,0 +1,91 @@
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Tries to find GTest headers and libraries.
+#
+# Usage of this module as follows:
+#
+#  find_package(GTest)
+#
+# Variables used by this module, they can change the default behaviour and need
+# to be set before calling find_package:
+#
+#  GTest_HOME - When set, this path is inspected instead of standard library
+#                locations as the root of the GTest installation.
+#                The environment variable GTEST_HOME overrides this variable.
+#
+# This module defines
+#  GTEST_INCLUDE_DIR, directory containing headers
+#  GTEST_LIBS, directory containing gtest libraries
+#  GTEST_STATIC_LIB, path to libgtest.a
+#  GTEST_SHARED_LIB, path to libgtest's shared library
+#  GTEST_FOUND, whether gtest has been found
+
+if( NOT "$ENV{GTEST_HOME}" STREQUAL "")
+    file( TO_CMAKE_PATH "$ENV{GTEST_HOME}" _native_path )
+    list( APPEND _gtest_roots ${_native_path} )
+elseif ( GTest_HOME )
+    list( APPEND _gtest_roots ${GTest_HOME} )
+endif()
+
+# Try the parameterized roots, if they exist
+if ( _gtest_roots )
+    find_path( GTEST_INCLUDE_DIR NAMES gtest/gtest.h
+        PATHS ${_gtest_roots} NO_DEFAULT_PATH
+        PATH_SUFFIXES "include" )
+    find_library( GTEST_LIBRARIES NAMES gtest
+        PATHS ${_gtest_roots} NO_DEFAULT_PATH
+        PATH_SUFFIXES "lib" )
+else ()
+    find_path( GTEST_INCLUDE_DIR NAMES gtest/gtest.h )
+    find_library( GTEST_LIBRARIES NAMES gtest )
+endif ()
+
+
+if (GTEST_INCLUDE_DIR AND GTEST_LIBRARIES)
+  set(GTEST_FOUND TRUE)
+  get_filename_component( GTEST_LIBS ${GTEST_LIBRARIES} DIRECTORY )
+  set(GTEST_LIB_NAME libgtest)
+  set(GTEST_STATIC_LIB ${GTEST_LIBS}/${GTEST_LIB_NAME}.a)
+  set(GTEST_SHARED_LIB ${GTEST_LIBS}/${GTEST_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+else ()
+  set(GTEST_FOUND FALSE)
+endif ()
+
+if (GTEST_FOUND)
+  if (NOT GTest_FIND_QUIETLY)
+    message(STATUS "Found the GTest library: ${GTEST_LIBRARIES}")
+  endif ()
+else ()
+  if (NOT GTest_FIND_QUIETLY)
+    set(GTEST_ERR_MSG "Could not find the GTest library. Looked in ")
+    if ( _gtest_roots )
+      set(GTEST_ERR_MSG "${GTEST_ERR_MSG}${_gtest_roots}.")
+    else ()
+      set(GTEST_ERR_MSG "${GTEST_ERR_MSG}system search paths.")
+    endif ()
+    if (GTest_FIND_REQUIRED)
+      message(FATAL_ERROR "${GTEST_ERR_MSG}")
+    else (GTest_FIND_REQUIRED)
+      message(STATUS "${GTEST_ERR_MSG}")
+    endif (GTest_FIND_REQUIRED)
+  endif ()
+endif ()
+
+mark_as_advanced(
+  GTEST_INCLUDE_DIR
+  GTEST_LIBS
+  GTEST_LIBRARIES
+  GTEST_STATIC_LIB
+  GTEST_SHARED_LIB
+)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/cmake_modules/FindParquet.cmake
----------------------------------------------------------------------
diff --git a/cpp/cmake_modules/FindParquet.cmake b/cpp/cmake_modules/FindParquet.cmake
new file mode 100644
index 0000000..76c2d1d
--- /dev/null
+++ b/cpp/cmake_modules/FindParquet.cmake
@@ -0,0 +1,80 @@
+# Copyright 2012 Cloudera Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# - Find PARQUET (parquet/parquet.h, libparquet.a, libparquet.so)
+# This module defines
+#  PARQUET_INCLUDE_DIR, directory containing headers
+#  PARQUET_LIBS, directory containing parquet libraries
+#  PARQUET_STATIC_LIB, path to libparquet.a
+#  PARQUET_SHARED_LIB, path to libparquet's shared library
+#  PARQUET_FOUND, whether parquet has been found
+
+if( NOT "$ENV{PARQUET_HOME}" STREQUAL "")
+    file( TO_CMAKE_PATH "$ENV{PARQUET_HOME}" _native_path )
+    list( APPEND _parquet_roots ${_native_path} )
+elseif ( Parquet_HOME )
+    list( APPEND _parquet_roots ${Parquet_HOME} )
+endif()
+
+# Try the parameterized roots, if they exist
+if ( _parquet_roots )
+    find_path( PARQUET_INCLUDE_DIR NAMES parquet/parquet.h
+        PATHS ${_parquet_roots} NO_DEFAULT_PATH
+        PATH_SUFFIXES "include" )
+    find_library( PARQUET_LIBRARIES NAMES parquet
+        PATHS ${_parquet_roots} NO_DEFAULT_PATH
+        PATH_SUFFIXES "lib" )
+else ()
+    find_path( PARQUET_INCLUDE_DIR NAMES parquet/parquet.h )
+    find_library( PARQUET_LIBRARIES NAMES parquet )
+endif ()
+
+
+if (PARQUET_INCLUDE_DIR AND PARQUET_LIBRARIES)
+  set(PARQUET_FOUND TRUE)
+  get_filename_component( PARQUET_LIBS ${PARQUET_LIBRARIES} DIRECTORY )
+  set(PARQUET_LIB_NAME libparquet)
+  set(PARQUET_STATIC_LIB ${PARQUET_LIBS}/${PARQUET_LIB_NAME}.a)
+  set(PARQUET_SHARED_LIB ${PARQUET_LIBS}/${PARQUET_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+else ()
+  set(PARQUET_FOUND FALSE)
+endif ()
+
+if (PARQUET_FOUND)
+  if (NOT Parquet_FIND_QUIETLY)
+    message(STATUS "Found the Parquet library: ${PARQUET_LIBRARIES}")
+  endif ()
+else ()
+  if (NOT Parquet_FIND_QUIETLY)
+    set(PARQUET_ERR_MSG "Could not find the Parquet library. Looked in ")
+    if ( _parquet_roots )
+      set(PARQUET_ERR_MSG "${PARQUET_ERR_MSG}${_parquet_roots}.")
+    else ()
+      set(PARQUET_ERR_MSG "${PARQUET_ERR_MSG}system search paths.")
+    endif ()
+    if (Parquet_FIND_REQUIRED)
+      message(FATAL_ERROR "${PARQUET_ERR_MSG}")
+    else (Parquet_FIND_REQUIRED)
+      message(STATUS "${PARQUET_ERR_MSG}")
+    endif (Parquet_FIND_REQUIRED)
+  endif ()
+endif ()
+
+mark_as_advanced(
+  PARQUET_INCLUDE_DIR
+  PARQUET_LIBS
+  PARQUET_LIBRARIES
+  PARQUET_STATIC_LIB
+  PARQUET_SHARED_LIB
+)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/cmake_modules/san-config.cmake
----------------------------------------------------------------------
diff --git a/cpp/cmake_modules/san-config.cmake b/cpp/cmake_modules/san-config.cmake
new file mode 100644
index 0000000..b847c96
--- /dev/null
+++ b/cpp/cmake_modules/san-config.cmake
@@ -0,0 +1,92 @@
+# Clang does not support using ASAN and TSAN simultaneously.
+if ("${ARROW_USE_ASAN}" AND "${ARROW_USE_TSAN}")
+  message(SEND_ERROR "Can only enable one of ASAN or TSAN at a time")
+endif()
+
+# Flag to enable clang address sanitizer
+# This will only build if clang or a recent enough gcc is the chosen compiler
+if (${ARROW_USE_ASAN})
+  if(NOT (("${COMPILER_FAMILY}" STREQUAL "clang") OR
+          ("${COMPILER_FAMILY}" STREQUAL "gcc" AND "${COMPILER_VERSION}" VERSION_GREATER "4.8")))
+    message(SEND_ERROR "Cannot use ASAN without clang or gcc >= 4.8")
+  endif()
+
+  # If UBSAN is also enabled, and we're on clang < 3.5, ensure static linking is
+  # enabled. Otherwise, we run into https://llvm.org/bugs/show_bug.cgi?id=18211
+  if("${ARROW_USE_UBSAN}" AND
+      "${COMPILER_FAMILY}" STREQUAL "clang" AND
+      "${COMPILER_VERSION}" VERSION_LESS "3.5")
+    if("${ARROW_LINK}" STREQUAL "a")
+      message("Using static linking for ASAN+UBSAN build")
+      set(ARROW_LINK "s")
+    elseif("${ARROW_LINK}" STREQUAL "d")
+      message(SEND_ERROR "Cannot use dynamic linking when ASAN and UBSAN are both enabled")
+    endif()
+  endif()
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -DADDRESS_SANITIZER")
+endif()
+
+
+# Flag to enable clang undefined behavior sanitizer
+# We explicitly don't enable all of the sanitizer flags:
+# - disable 'vptr' because it currently crashes somewhere in boost::intrusive::list code
+# - disable 'alignment' because unaligned access is really OK on Nehalem and we do it
+#   all over the place.
+if (${ARROW_USE_UBSAN})
+  if(NOT (("${COMPILER_FAMILY}" STREQUAL "clang") OR
+          ("${COMPILER_FAMILY}" STREQUAL "gcc" AND "${COMPILER_VERSION}" VERSION_GREATER "4.9")))
+    message(SEND_ERROR "Cannot use UBSAN without clang or gcc >= 4.9")
+  endif()
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=undefined -fno-sanitize=alignment,vptr -fno-sanitize-recover")
+endif ()
+
+# Flag to enable thread sanitizer (clang or gcc 4.8)
+if (${ARROW_USE_TSAN})
+  if(NOT (("${COMPILER_FAMILY}" STREQUAL "clang") OR
+          ("${COMPILER_FAMILY}" STREQUAL "gcc" AND "${COMPILER_VERSION}" VERSION_GREATER "4.8")))
+    message(SEND_ERROR "Cannot use TSAN without clang or gcc >= 4.8")
+  endif()
+
+  add_definitions("-fsanitize=thread")
+
+  # Enables dynamic_annotations.h to actually generate code
+  add_definitions("-DDYNAMIC_ANNOTATIONS_ENABLED")
+
+  # changes atomicops to use the tsan implementations
+  add_definitions("-DTHREAD_SANITIZER")
+
+  # Disables using the precompiled template specializations for std::string, shared_ptr, etc
+  # so that the annotations in the header actually take effect.
+  add_definitions("-D_GLIBCXX_EXTERN_TEMPLATE=0")
+
+  # Some of the above also need to be passed to the linker.
+  set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -pie -fsanitize=thread")
+
+  # Strictly speaking, TSAN doesn't require dynamic linking. But it does
+  # require all code to be position independent, and the easiest way to
+  # guarantee that is via dynamic linking (not all 3rd party archives are
+  # compiled with -fPIC e.g. boost).
+  if("${ARROW_LINK}" STREQUAL "a")
+    message("Using dynamic linking for TSAN")
+    set(ARROW_LINK "d")
+  elseif("${ARROW_LINK}" STREQUAL "s")
+    message(SEND_ERROR "Cannot use TSAN with static linking")
+  endif()
+endif()
+
+
+if ("${ARROW_USE_UBSAN}" OR "${ARROW_USE_ASAN}" OR "${ARROW_USE_TSAN}")
+  # GCC 4.8 and 4.9 (latest as of this writing) don't allow you to specify a
+  # sanitizer blacklist.
+  if("${COMPILER_FAMILY}" STREQUAL "clang")
+    # Require clang 3.4 or newer; clang 3.3 has issues with TSAN and pthread
+    # symbol interception.
+    if("${COMPILER_VERSION}" VERSION_LESS "3.4")
+      message(SEND_ERROR "Must use clang 3.4 or newer to run a sanitizer build."
+        " Try using clang from $NATIVE_TOOLCHAIN/")
+    endif()
+    add_definitions("-fsanitize-blacklist=${BUILD_SUPPORT_DIR}/sanitize-blacklist.txt")
+  else()
+    message(WARNING "GCC does not support specifying a sanitizer blacklist. Known sanitizer check failures will not be suppressed.")
+  endif()
+endif()
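
Editorial note on san-config.cmake: the way the sanitizer options adjust ARROW_LINK can be summarized as a small decision function. The Python sketch below is not part of the commit. resolve_link and version_tuple are hypothetical names, and the reading of "a" as automatic, "s" as static and "d" as dynamic is an assumption inferred from the messages in the module. Compiler support checks and flag settings are left out.

```python
# Sketch only: resolve_link and version_tuple are hypothetical helper names.


def version_tuple(version):
    """Turn "3.4" into (3, 4) so versions compare component by component."""
    return tuple(int(p) for p in version.split("."))


def resolve_link(use_asan, use_ubsan, use_tsan, family, version, link):
    """Return (link, errors) following the ARROW_LINK adjustments.

    link is assumed to be "a" (auto), "s" (static) or "d" (dynamic).
    """
    errors = []
    if use_asan and use_tsan:
        errors.append("Can only enable one of ASAN or TSAN at a time")
    # ASAN+UBSAN on clang older than 3.5 needs static linking.
    if (use_asan and use_ubsan and family == "clang"
            and version_tuple(version) < (3, 5)):
        if link == "a":
            link = "s"
        elif link == "d":
            errors.append(
                "Cannot use dynamic linking when ASAN and UBSAN are both enabled")
    # TSAN needs position-independent code, obtained via dynamic linking.
    if use_tsan:
        if link == "a":
            link = "d"
        elif link == "s":
            errors.append("Cannot use TSAN with static linking")
    return link, errors
```

Errors are collected rather than raised because the module uses SEND_ERROR, which lets configuration continue and report further problems.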

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/setup_build_env.sh
----------------------------------------------------------------------
diff --git a/cpp/setup_build_env.sh b/cpp/setup_build_env.sh
new file mode 100755
index 0000000..457b971
--- /dev/null
+++ b/cpp/setup_build_env.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+set -e
+
+SOURCE_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
+
+"$SOURCE_DIR"/thirdparty/download_thirdparty.sh
+"$SOURCE_DIR"/thirdparty/build_thirdparty.sh
+
+export GTEST_HOME=$SOURCE_DIR/thirdparty/$GTEST_BASEDIR
+
+echo "Build env initialized"

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/CMakeLists.txt b/cpp/src/arrow/CMakeLists.txt
new file mode 100644
index 0000000..eeea2db
--- /dev/null
+++ b/cpp/src/arrow/CMakeLists.txt
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Headers: top level
+install(FILES
+  api.h
+  array.h
+  builder.h
+  type.h
+  DESTINATION include/arrow)
+
+#######################################
+# Unit tests
+#######################################
+
+set(ARROW_TEST_LINK_LIBS arrow_test_util ${ARROW_MIN_TEST_LIBS})
+
+ADD_ARROW_TEST(array-test)
+ADD_ARROW_TEST(field-test)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/api.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/api.h b/cpp/src/arrow/api.h
new file mode 100644
index 0000000..899e8aa
--- /dev/null
+++ b/cpp/src/arrow/api.h
@@ -0,0 +1,21 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_API_H
+#define ARROW_API_H
+
+#endif // ARROW_API_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/array-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/array-test.cc b/cpp/src/arrow/array-test.cc
new file mode 100644
index 0000000..5ecf916
--- /dev/null
+++ b/cpp/src/arrow/array-test.cc
@@ -0,0 +1,92 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/test-util.h"
+#include "arrow/type.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/primitive.h"
+#include "arrow/util/buffer.h"
+
+using std::string;
+using std::vector;
+
+namespace arrow {
+
+static TypePtr int32 = TypePtr(new Int32Type());
+static TypePtr int32_nn = TypePtr(new Int32Type(false));
+
+
+class TestArray : public ::testing::Test {
+ public:
+  void SetUp() {
+    auto data = std::make_shared<OwnedMutableBuffer>();
+    auto nulls = std::make_shared<OwnedMutableBuffer>();
+
+    ASSERT_OK(data->Resize(400));
+    ASSERT_OK(nulls->Resize(128));
+
+    arr_.reset(new Int32Array(100, data, nulls));
+  }
+
+ protected:
+  std::unique_ptr<Int32Array> arr_;
+};
+
+
+TEST_F(TestArray, TestNullable) {
+  std::shared_ptr<Buffer> tmp = arr_->data();
+  std::unique_ptr<Int32Array> arr_nn(new Int32Array(100, tmp));
+
+  ASSERT_TRUE(arr_->nullable());
+  ASSERT_FALSE(arr_nn->nullable());
+}
+
+
+TEST_F(TestArray, TestLength) {
+  ASSERT_EQ(arr_->length(), 100);
+}
+
+TEST_F(TestArray, TestIsNull) {
+  vector<uint8_t> nulls = {1, 0, 1, 1, 0, 1, 0, 0,
+                           1, 0, 1, 1, 0, 1, 0, 0,
+                           1, 0, 1, 1, 0, 1, 0, 0,
+                           1, 0, 1, 1, 0, 1, 0, 0,
+                           1, 0, 0, 1};
+
+  std::shared_ptr<Buffer> null_buf = bytes_to_null_buffer(nulls.data(), nulls.size());
+  std::unique_ptr<Array> arr;
+  arr.reset(new Array(int32, nulls.size(), null_buf));
+
+  ASSERT_EQ(null_buf->size(), 5);
+  for (size_t i = 0; i < nulls.size(); ++i) {
+    ASSERT_EQ(static_cast<bool>(nulls[i]), arr->IsNull(i));
+  }
+}
+
+
+TEST_F(TestArray, TestCopy) {
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/array.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/array.cc b/cpp/src/arrow/array.cc
new file mode 100644
index 0000000..1726a2f
--- /dev/null
+++ b/cpp/src/arrow/array.cc
@@ -0,0 +1,44 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/array.h"
+
+#include "arrow/util/buffer.h"
+
+namespace arrow {
+
+// ----------------------------------------------------------------------
+// Base array class
+
+Array::Array(const TypePtr& type, int64_t length,
+    const std::shared_ptr<Buffer>& nulls) {
+  Init(type, length, nulls);
+}
+
+void Array::Init(const TypePtr& type, int64_t length,
+    const std::shared_ptr<Buffer>& nulls) {
+  type_ = type;
+  length_ = length;
+  nulls_ = nulls;
+
+  nullable_ = type->nullable;
+  if (nulls_) {
+    null_bits_ = nulls_->data();
+  }
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/array.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/array.h b/cpp/src/arrow/array.h
new file mode 100644
index 0000000..c95450d
--- /dev/null
+++ b/cpp/src/arrow/array.h
@@ -0,0 +1,79 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_ARRAY_H
+#define ARROW_ARRAY_H
+
+#include <cstdint>
+#include <cstdlib>
+#include <memory>
+
+#include "arrow/type.h"
+#include "arrow/util/bit-util.h"
+#include "arrow/util/macros.h"
+
+namespace arrow {
+
+class Buffer;
+
+// Immutable data array with some logical type and some length. Any memory is
+// owned by the respective Buffer instance (or its parents). May or may not be
+// nullable.
+//
+// The base class holds only the null bitmap (if the data type is nullable)
+//
+// Any buffers used to initialize the array have their references "stolen". If
+// you wish to use the buffer beyond the lifetime of the array, you need to
+// explicitly increment its reference count
+class Array {
+ public:
+  Array() : length_(0), nulls_(nullptr), null_bits_(nullptr) {}
+  Array(const TypePtr& type, int64_t length,
+      const std::shared_ptr<Buffer>& nulls = nullptr);
+
+  virtual ~Array() {}
+
+  void Init(const TypePtr& type, int64_t length, const std::shared_ptr<Buffer>& nulls);
+
+  // Determine if a slot is null. For inner loops. Does *not* bounds-check
+  bool IsNull(int64_t i) const {
+    return nullable_ && util::get_bit(null_bits_, i);
+  }
+
+  int64_t length() const { return length_;}
+  bool nullable() const { return nullable_;}
+  const TypePtr& type() const { return type_;}
+  TypeEnum type_enum() const { return type_->type;}
+
+ protected:
+  TypePtr type_;
+  bool nullable_;
+  int64_t length_;
+
+  std::shared_ptr<Buffer> nulls_;
+  const uint8_t* null_bits_;
+
+ private:
+  DISALLOW_COPY_AND_ASSIGN(Array);
+};
+
+
+typedef std::shared_ptr<Array> ArrayPtr;
+
+} // namespace arrow
+
+#endif
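
Editor's sketch (not part of the patch): Array::IsNull above reads one bit per slot from the null bitmap, and TestIsNull in array-test.cc expects 36 one-byte flags to pack into 5 bytes. The helpers below are hypothetical stand-ins for util::get_bit and util::bytes_to_bits, which live in arrow/util/bit-util.h and are not shown in this excerpt. The least-significant-bit-first ordering within each byte is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for util::get_bit. Assumes slot i lives in byte i / 8,
// at bit position i % 8 counted from the least significant bit.
inline bool demo_get_bit(const uint8_t* bits, int64_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical helper: set the bit for slot i to 1.
inline void demo_set_bit(uint8_t* bits, int64_t i) {
  bits[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
}

// Hypothetical stand-in for util::bytes_to_bits: pack one-byte-per-slot flags
// into a bitmap of ceil(n / 8) bytes. A nonzero flag sets the bit.
inline std::vector<uint8_t> demo_bytes_to_bits(const std::vector<uint8_t>& flags) {
  std::vector<uint8_t> bits((flags.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < flags.size(); ++i) {
    if (flags[i]) {
      demo_set_bit(bits.data(), static_cast<int64_t>(i));
    }
  }
  return bits;
}

// Mirrors the expression in Array::IsNull: a non-nullable array never reports
// a null, regardless of what the bitmap contains.
inline bool demo_is_null(bool nullable, const uint8_t* null_bits, int64_t i) {
  return nullable && demo_get_bit(null_bits, i);
}
```

As in the patch, a set bit marks a null slot, and the nullable check short-circuits before the bitmap pointer is touched, so a non-nullable array can leave the pointer unset.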

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/builder.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/builder.cc b/cpp/src/arrow/builder.cc
new file mode 100644
index 0000000..1fd7471
--- /dev/null
+++ b/cpp/src/arrow/builder.cc
@@ -0,0 +1,63 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/builder.h"
+
+#include <cstring>
+
+#include "arrow/util/bit-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+Status ArrayBuilder::Init(int64_t capacity) {
+  capacity_ = capacity;
+
+  if (nullable_) {
+    int64_t to_alloc = util::ceil_byte(capacity) / 8;
+    nulls_ = std::make_shared<OwnedMutableBuffer>();
+    RETURN_NOT_OK(nulls_->Resize(to_alloc));
+    null_bits_ = nulls_->mutable_data();
+    memset(null_bits_, 0, to_alloc);
+  }
+  return Status::OK();
+}
+
+Status ArrayBuilder::Resize(int64_t new_bits) {
+  if (nullable_) {
+    int64_t new_bytes = util::ceil_byte(new_bits) / 8;
+    int64_t old_bytes = nulls_->size();
+    RETURN_NOT_OK(nulls_->Resize(new_bytes));
+    null_bits_ = nulls_->mutable_data();
+    if (old_bytes < new_bytes) {
+      memset(null_bits_ + old_bytes, 0, new_bytes - old_bytes);
+    }
+  }
+  return Status::OK();
+}
+
+Status ArrayBuilder::Advance(int64_t elements) {
+  if (nullable_ && length_ + elements > capacity_) {
+    return Status::Invalid("Builder must be expanded");
+  }
+  length_ += elements;
+  return Status::OK();
+}
+
+
+} // namespace arrow
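
Editor's sketch (not part of the patch): ArrayBuilder::Init and ArrayBuilder::Resize above size the null bitmap as ceil_byte(n) / 8 bytes and zero only the bytes gained by a resize. The code below illustrates that arithmetic with a std::vector instead of OwnedMutableBuffer. demo_ceil_byte is a hypothetical stand-in for util::ceil_byte, assumed here to round a bit count up to the next multiple of 8; DemoNullBitmap is likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for util::ceil_byte: round up to a multiple of 8.
inline int64_t demo_ceil_byte(int64_t bits) {
  return (bits + 7) & ~int64_t{7};
}

// Bytes needed to hold one bit per slot.
inline int64_t demo_bitmap_bytes(int64_t slots) {
  return demo_ceil_byte(slots) / 8;
}

// Hypothetical bitmap holder following the same steps as ArrayBuilder::Resize.
struct DemoNullBitmap {
  std::vector<uint8_t> bytes;

  void Resize(int64_t new_bits) {
    const std::size_t old_bytes = bytes.size();
    const std::size_t new_bytes = static_cast<std::size_t>(demo_bitmap_bytes(new_bits));
    bytes.resize(new_bytes);
    // std::vector already zero-fills new elements; the explicit memset is kept
    // to mirror the raw-buffer case, where growth leaves the tail uninitialized.
    if (old_bytes < new_bytes) {
      std::memset(bytes.data() + old_bytes, 0, new_bytes - old_bytes);
    }
  }
};
```

Zeroing only the tail preserves null flags that were already written, which is why Resize in the patch starts its memset at old_bytes rather than at the beginning of the buffer.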

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/builder.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/builder.h b/cpp/src/arrow/builder.h
new file mode 100644
index 0000000..b43668a
--- /dev/null
+++ b/cpp/src/arrow/builder.h
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_BUILDER_H
+#define ARROW_BUILDER_H
+
+#include <cstdint>
+#include <memory>
+#include <vector>
+
+#include "arrow/type.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/macros.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+class Array;
+
+static constexpr int64_t MIN_BUILDER_CAPACITY = 1 << 8;
+
+// Base class for all data array builders
+class ArrayBuilder {
+ public:
+  explicit ArrayBuilder(const TypePtr& type)
+      : type_(type),
+        nullable_(type_->nullable),
+        nulls_(nullptr), null_bits_(nullptr),
+        length_(0),
+        capacity_(0) {}
+
+  virtual ~ArrayBuilder() {}
+
+  // For nested types. Since the objects are owned by this class instance, we
+  // skip shared pointers and just return a raw pointer
+  ArrayBuilder* child(int i) {
+    return children_[i].get();
+  }
+
+  int num_children() const {
+    return children_.size();
+  }
+
+  int64_t length() const { return length_;}
+  int64_t capacity() const { return capacity_;}
+  bool nullable() const { return nullable_;}
+
+  // Allocates the required memory at this level, but children need to be
+  // initialized independently
+  Status Init(int64_t capacity);
+
+  // Resizes the nulls array (if nullable)
+  Status Resize(int64_t new_bits);
+
+  // For cases where raw data was memcpy'd into the internal buffers, allows us
+  // to advance the length of the builder. The caller must ensure that the
+  // slots being advanced over were actually populated.
+  Status Advance(int64_t elements);
+
+  const std::shared_ptr<OwnedMutableBuffer>& nulls() const { return nulls_;}
+
+  // Creates new array object to hold the contents of the builder and transfers
+  // ownership of the data
+  virtual Status ToArray(Array** out) = 0;
+
+ protected:
+  TypePtr type_;
+  bool nullable_;
+
+  // If the type is not nullable, then nulls_ is nullptr after initialization
+  std::shared_ptr<OwnedMutableBuffer> nulls_;
+  uint8_t* null_bits_;
+
+  // Array length, so far. Also, the index of the next element to be added
+  int64_t length_;
+  int64_t capacity_;
+
+  // Child value array builders. These are owned by this class
+  std::vector<std::unique_ptr<ArrayBuilder> > children_;
+
+ private:
+  DISALLOW_COPY_AND_ASSIGN(ArrayBuilder);
+};
+
+} // namespace arrow
+
+#endif // ARROW_BUILDER_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/field-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/field-test.cc b/cpp/src/arrow/field-test.cc
new file mode 100644
index 0000000..2bb8bad
--- /dev/null
+++ b/cpp/src/arrow/field-test.cc
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <memory>
+#include <string>
+
+#include "arrow/field.h"
+#include "arrow/type.h"
+#include "arrow/types/integer.h"
+
+using std::string;
+
+namespace arrow {
+
+TEST(TestField, Basics) {
+  TypePtr ftype = TypePtr(new Int32Type());
+  Field f0("f0", ftype);
+
+  ASSERT_EQ(f0.name, "f0");
+  ASSERT_EQ(f0.type->ToString(), ftype->ToString());
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/field.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/field.h b/cpp/src/arrow/field.h
new file mode 100644
index 0000000..664cae6
--- /dev/null
+++ b/cpp/src/arrow/field.h
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_FIELD_H
+#define ARROW_FIELD_H
+
+#include <string>
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+// A field is a piece of metadata that includes (for now) a name and a data
+// type
+
+struct Field {
+  // Field name
+  std::string name;
+
+  // The field's data type
+  TypePtr type;
+
+  Field(const std::string& name, const TypePtr& type) :
+      name(name), type(type) {}
+
+  bool Equals(const Field& other) const {
+    return (this == &other) || (this->name == other.name &&
+        this->type->Equals(other.type.get()));
+  }
+};
+
+} // namespace arrow
+
+#endif  // ARROW_FIELD_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/parquet/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/parquet/CMakeLists.txt b/cpp/src/arrow/parquet/CMakeLists.txt
new file mode 100644
index 0000000..7b449af
--- /dev/null
+++ b/cpp/src/arrow/parquet/CMakeLists.txt
@@ -0,0 +1,35 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# ----------------------------------------------------------------------
+# arrow_parquet : Arrow <-> Parquet adapter
+
+set(PARQUET_SRCS
+)
+
+set(PARQUET_LIBS
+)
+
+add_library(arrow_parquet STATIC
+  ${PARQUET_SRCS}
+)
+target_link_libraries(arrow_parquet ${PARQUET_LIBS})
+SET_TARGET_PROPERTIES(arrow_parquet PROPERTIES LINKER_LANGUAGE CXX)
+
+# Headers: top level
+install(FILES
+  DESTINATION include/arrow/parquet)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/test-util.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/test-util.h b/cpp/src/arrow/test-util.h
new file mode 100644
index 0000000..2233a4f
--- /dev/null
+++ b/cpp/src/arrow/test-util.h
@@ -0,0 +1,97 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TEST_UTIL_H_
+#define ARROW_TEST_UTIL_H_
+
+#include <gtest/gtest.h>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/util/bit-util.h"
+#include "arrow/util/random.h"
+#include "arrow/util/status.h"
+
+#define ASSERT_RAISES(ENUM, expr)               \
+  do {                                          \
+    Status s = (expr);                          \
+    ASSERT_TRUE(s.Is##ENUM());                  \
+  } while (0)
+
+
+#define ASSERT_OK(expr)                         \
+  do {                                          \
+    Status s = (expr);                          \
+    ASSERT_TRUE(s.ok());                        \
+  } while (0)
+
+
+#define EXPECT_OK(expr)                         \
+  do {                                          \
+    Status s = (expr);                          \
+    EXPECT_TRUE(s.ok());                        \
+  } while (0)
+
+
+namespace arrow {
+
+template <typename T>
+void randint(int64_t N, T lower, T upper, std::vector<T>* out) {
+  Random rng(random_seed());
+  uint64_t draw;
+  uint64_t span = upper - lower;
+  T val;
+  for (int64_t i = 0; i < N; ++i) {
+    draw = rng.Uniform64(span);
+    val = lower + static_cast<T>(draw);
+    out->push_back(val);
+  }
+}
+
+
+template <typename T>
+std::shared_ptr<Buffer> to_buffer(const std::vector<T>& values) {
+  return std::make_shared<Buffer>(reinterpret_cast<const uint8_t*>(values.data()),
+      values.size() * sizeof(T));
+}
+
+void random_nulls(int64_t n, double pct_null, std::vector<uint8_t>* nulls) {
+  Random rng(random_seed());
+  for (int i = 0; i < n; ++i) {
+    nulls->push_back(static_cast<uint8_t>(rng.NextDoubleFraction() > pct_null));
+  }
+}
+
+void random_nulls(int64_t n, double pct_null, std::vector<bool>* nulls) {
+  Random rng(random_seed());
+  for (int i = 0; i < n; ++i) {
+    nulls->push_back(rng.NextDoubleFraction() > pct_null);
+  }
+}
+
+std::shared_ptr<Buffer> bytes_to_null_buffer(uint8_t* bytes, int length) {
+  std::shared_ptr<Buffer> out;
+
+  // TODO(wesm): error checking
+  util::bytes_to_bits(bytes, length, &out);
+  return out;
+}
+
+} // namespace arrow
+
+#endif // ARROW_TEST_UTIL_H_

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/type.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/type.cc b/cpp/src/arrow/type.cc
new file mode 100644
index 0000000..492eee5
--- /dev/null
+++ b/cpp/src/arrow/type.cc
@@ -0,0 +1,22 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/type.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/type.h b/cpp/src/arrow/type.h
new file mode 100644
index 0000000..220f99f
--- /dev/null
+++ b/cpp/src/arrow/type.h
@@ -0,0 +1,180 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPE_H
+#define ARROW_TYPE_H
+
+#include <memory>
+#include <string>
+
+namespace arrow {
+
+// Physical data type that describes the memory layout of values. See details
+// for each type
+enum class LayoutEnum: char {
+  // A physical type consisting of some non-negative number of bytes
+  BYTE = 0,
+
+  // A physical type consisting of some non-negative number of bits
+  BIT = 1,
+
+  // A parametric variable-length value type. Full specification requires a
+  // child logical type
+  LIST = 2,
+
+  // A collection of multiple equal-length child arrays. Parametric type taking
+  // 1 or more child logical types
+  STRUCT = 3,
+
+  // An array with heterogeneous value types. Parametric types taking 1 or more
+  // child logical types
+  DENSE_UNION = 4,
+  SPARSE_UNION = 5
+};
+
+
+struct LayoutType {
+  LayoutEnum type;
+  explicit LayoutType(LayoutEnum type) : type(type) {}
+};
+
+
+// Data types in this library are all *logical*. They can be expressed as
+// either a primitive physical type (bytes or bits of some fixed size), a
+// nested type consisting of other data types, or another data type (e.g. a
+// timestamp encoded as an int64)
+//
+// Any data type can be nullable
+
+enum class TypeEnum: char {
+  // A degenerate NULL type represented as 0 bytes/bits
+  NA = 0,
+
+  // Little-endian integer types
+  UINT8 = 1,
+  INT8 = 2,
+  UINT16 = 3,
+  INT16 = 4,
+  UINT32 = 5,
+  INT32 = 6,
+  UINT64 = 7,
+  INT64 = 8,
+
+  // A boolean value represented as 1 byte
+  BOOL = 9,
+
+  // A boolean value represented as 1 bit
+  BIT = 10,
+
+  // 4-byte floating point value
+  FLOAT = 11,
+
+  // 8-byte floating point value
+  DOUBLE = 12,
+
+  // CHAR(N): fixed-length UTF8 string with length N
+  CHAR = 13,
+
+  // UTF8 variable-length string as List<Char>
+  STRING = 14,
+
+  // VARCHAR(N): Null-terminated string type embedded in a CHAR(N + 1)
+  VARCHAR = 15,
+
+  // Variable-length bytes (no guarantee of UTF8-ness)
+  BINARY = 16,
+
+  // By default, int32 days since the UNIX epoch
+  DATE = 17,
+
+  // Exact timestamp encoded with int64 since UNIX epoch
+  // Default unit millisecond
+  TIMESTAMP = 18,
+
+  // Timestamp as double seconds since the UNIX epoch
+  TIMESTAMP_DOUBLE = 19,
+
+  // Exact time encoded with int64, default unit millisecond
+  TIME = 20,
+
+  // Precision- and scale-based decimal type. Storage type depends on the
+  // parameters.
+  DECIMAL = 21,
+
+  // Decimal value encoded as a text string
+  DECIMAL_TEXT = 22,
+
+  // A list of some logical data type
+  LIST = 30,
+
+  // Struct of logical types
+  STRUCT = 31,
+
+  // Unions of logical types
+  DENSE_UNION = 32,
+  SPARSE_UNION = 33,
+
+  // Union<Null, Int32, Double, String, Bool>
+  JSON_SCALAR = 50,
+
+  // User-defined type
+  USER = 60
+};
+
+
+struct DataType {
+  TypeEnum type;
+  bool nullable;
+
+  explicit DataType(TypeEnum type, bool nullable = true)
+      : type(type), nullable(nullable) {}
+
+  virtual bool Equals(const DataType* other) {
+    return (this == other) || (this->type == other->type &&
+        this->nullable == other->nullable);
+  }
+
+  virtual std::string ToString() const = 0;
+};
+
+
+typedef std::shared_ptr<LayoutType> LayoutPtr;
+typedef std::shared_ptr<DataType> TypePtr;
+
+
+struct BytesType : public LayoutType {
+  int size;
+
+  explicit BytesType(int size)
+      : LayoutType(LayoutEnum::BYTE),
+        size(size) {}
+
+  BytesType(const BytesType& other)
+      : BytesType(other.size) {}
+};
+
+struct ListLayoutType : public LayoutType {
+  LayoutPtr value_type;
+
+  explicit ListLayoutType(const LayoutPtr& value_type)
+      : LayoutType(LayoutEnum::LIST),
+        value_type(value_type) {}
+};
+
+} // namespace arrow
+
+#endif  // ARROW_TYPE_H
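
Editor's sketch (not part of the patch): DataType above is abstract, with a type tag, a nullable flag, a default Equals, and a pure virtual ToString. The concrete integer types (for example Int32Type, used in array-test.cc) are defined in arrow/types/integer.h, which is not shown in this excerpt, so the code below is a self-contained approximation and not the patch's own definition. All Demo* names are hypothetical, the ToString format is an assumption, and the virtual destructor and const-qualified Equals are additions that differ from the header above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical subset of TypeEnum, reusing the tag values from type.h.
enum class DemoTypeEnum : char { INT32 = 6, DOUBLE = 12 };

// Hypothetical mirror of DataType: a logical type tag plus nullability.
struct DemoDataType {
  DemoTypeEnum type;
  bool nullable;

  explicit DemoDataType(DemoTypeEnum type, bool nullable = true)
      : type(type), nullable(nullable) {}

  virtual ~DemoDataType() {}

  // Same rule as DataType::Equals: identical object, or same tag and same
  // nullability.
  virtual bool Equals(const DemoDataType* other) const {
    return (this == other) ||
           (this->type == other->type && this->nullable == other->nullable);
  }

  virtual std::string ToString() const = 0;
};

// Hypothetical concrete type, constructed like Int32Type(false) in the tests.
struct DemoInt32Type : public DemoDataType {
  explicit DemoInt32Type(bool nullable = true)
      : DemoDataType(DemoTypeEnum::INT32, nullable) {}

  std::string ToString() const override {
    return nullable ? "int32" : "int32 not null";
  }
};

typedef std::shared_ptr<DemoDataType> DemoTypePtr;
```

Because nullability is part of the type in this design, two types with the same tag but different nullable flags compare unequal, which is what Field::Equals in field.h relies on when it delegates to type->Equals.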

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/CMakeLists.txt b/cpp/src/arrow/types/CMakeLists.txt
new file mode 100644
index 0000000..e090aea
--- /dev/null
+++ b/cpp/src/arrow/types/CMakeLists.txt
@@ -0,0 +1,63 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#######################################
+# arrow_types
+#######################################
+
+set(TYPES_SRCS
+  construct.cc
+  floating.cc
+  integer.cc
+  json.cc
+  list.cc
+  primitive.cc
+  string.cc
+  struct.cc
+  union.cc
+)
+
+set(TYPES_LIBS
+)
+
+add_library(arrow_types STATIC
+  ${TYPES_SRCS}
+)
+target_link_libraries(arrow_types ${TYPES_LIBS})
+SET_TARGET_PROPERTIES(arrow_types PROPERTIES LINKER_LANGUAGE CXX)
+
+# Headers: top level
+install(FILES
+  boolean.h
+  collection.h
+  datetime.h
+  decimal.h
+  floating.h
+  integer.h
+  json.h
+  list.h
+  primitive.h
+  string.h
+  struct.h
+  union.h
+  DESTINATION include/arrow/types)
+
+
+ADD_ARROW_TEST(list-test)
+ADD_ARROW_TEST(primitive-test)
+ADD_ARROW_TEST(string-test)
+ADD_ARROW_TEST(struct-test)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/binary.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/binary.h b/cpp/src/arrow/types/binary.h
new file mode 100644
index 0000000..a9f2004
--- /dev/null
+++ b/cpp/src/arrow/types/binary.h
@@ -0,0 +1,33 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_BINARY_H
+#define ARROW_TYPES_BINARY_H
+
+#include <string>
+#include <vector>
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct StringType : public DataType {
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_BINARY_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/boolean.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/boolean.h b/cpp/src/arrow/types/boolean.h
new file mode 100644
index 0000000..31388c8
--- /dev/null
+++ b/cpp/src/arrow/types/boolean.h
@@ -0,0 +1,35 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_BOOLEAN_H
+#define ARROW_TYPES_BOOLEAN_H
+
+#include "arrow/types/primitive.h"
+
+namespace arrow {
+
+struct BooleanType : public PrimitiveType<BooleanType> {
+  PRIMITIVE_DECL(BooleanType, uint8_t, BOOL, 1, "bool");
+};
+
+typedef PrimitiveArrayImpl<BooleanType> BooleanArray;
+
+// typedef PrimitiveBuilder<BooleanType, BooleanArray> BooleanBuilder;
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_BOOLEAN_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/collection.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/collection.h b/cpp/src/arrow/types/collection.h
new file mode 100644
index 0000000..59ba614
--- /dev/null
+++ b/cpp/src/arrow/types/collection.h
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_COLLECTION_H
+#define ARROW_TYPES_COLLECTION_H
+
+#include <string>
+#include <vector>
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+template <TypeEnum T>
+struct CollectionType : public DataType {
+  std::vector<TypePtr> child_types_;
+
+  explicit CollectionType(bool nullable = true) : DataType(T, nullable) {}
+
+  const TypePtr& child(int i) const {
+    return child_types_[i];
+  }
+
+  int num_children() const {
+    return child_types_.size();
+  }
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_COLLECTION_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/construct.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/construct.cc b/cpp/src/arrow/types/construct.cc
new file mode 100644
index 0000000..5176caf
--- /dev/null
+++ b/cpp/src/arrow/types/construct.cc
@@ -0,0 +1,88 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/construct.h"
+
+#include <memory>
+
+#include "arrow/types/floating.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/list.h"
+#include "arrow/types/string.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+class ArrayBuilder;
+
+// Initially looked at doing this with vtables, but shared pointers make it
+// difficult
+
+#define BUILDER_CASE(ENUM, BuilderType)                         \
+    case TypeEnum::ENUM:                                        \
+      *out = static_cast<ArrayBuilder*>(new BuilderType(type)); \
+      return Status::OK();
+
+Status make_builder(const TypePtr& type, ArrayBuilder** out) {
+  switch (type->type) {
+    BUILDER_CASE(UINT8, UInt8Builder);
+    BUILDER_CASE(INT8, Int8Builder);
+    BUILDER_CASE(UINT16, UInt16Builder);
+    BUILDER_CASE(INT16, Int16Builder);
+    BUILDER_CASE(UINT32, UInt32Builder);
+    BUILDER_CASE(INT32, Int32Builder);
+    BUILDER_CASE(UINT64, UInt64Builder);
+    BUILDER_CASE(INT64, Int64Builder);
+
+    // BUILDER_CASE(BOOL, BooleanBuilder);
+
+    BUILDER_CASE(FLOAT, FloatBuilder);
+    BUILDER_CASE(DOUBLE, DoubleBuilder);
+
+    BUILDER_CASE(STRING, StringBuilder);
+
+    case TypeEnum::LIST:
+      {
+        ListType* list_type = static_cast<ListType*>(type.get());
+        ArrayBuilder* value_builder;
+        RETURN_NOT_OK(make_builder(list_type->value_type, &value_builder));
+
+        // The ListBuilder takes ownership of the value_builder
+        ListBuilder* builder = new ListBuilder(type, value_builder);
+        *out = static_cast<ArrayBuilder*>(builder);
+        return Status::OK();
+      }
+    // BUILDER_CASE(CHAR, CharBuilder);
+
+    // BUILDER_CASE(VARCHAR, VarcharBuilder);
+    // BUILDER_CASE(BINARY, BinaryBuilder);
+
+    // BUILDER_CASE(DATE, DateBuilder);
+    // BUILDER_CASE(TIMESTAMP, TimestampBuilder);
+    // BUILDER_CASE(TIME, TimeBuilder);
+
+    // BUILDER_CASE(LIST, ListBuilder);
+    // BUILDER_CASE(STRUCT, StructBuilder);
+    // BUILDER_CASE(DENSE_UNION, DenseUnionBuilder);
+    // BUILDER_CASE(SPARSE_UNION, SparseUnionBuilder);
+
+    default:
+      return Status::NotImplemented(type->ToString());
+  }
+}
+
+} // namespace arrow
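
The dispatch in make_builder above can be sketched in isolation as a switch on a type tag, where the list case first builds the builder for its value type and then hands it to the list builder. This is a minimal sketch under stated assumptions: Kind, Type, Builder and MakeBuilder are hypothetical stand-ins rather than the Arrow classes, and it returns std::unique_ptr (nullptr for "not implemented") instead of the Status and raw out-pointer used in the patch.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not the Arrow classes.
enum class Kind { INT32, DOUBLE, LIST, STRUCT };

struct Type {
  Kind kind;
  std::shared_ptr<Type> value_type;  // only set for LIST
};

struct Builder {
  virtual ~Builder() = default;
  virtual std::string name() const = 0;
};

struct Int32Builder : Builder {
  std::string name() const override { return "int32"; }
};

struct DoubleBuilder : Builder {
  std::string name() const override { return "double"; }
};

struct ListBuilder : Builder {
  explicit ListBuilder(std::unique_ptr<Builder> values)
      : values_(std::move(values)) {}
  std::string name() const override { return "list<" + values_->name() + ">"; }
  std::unique_ptr<Builder> values_;  // the list builder owns its value builder
};

// Returns nullptr for kinds that have no builder yet.
std::unique_ptr<Builder> MakeBuilder(const Type& type) {
  switch (type.kind) {
    case Kind::INT32:
      return std::unique_ptr<Builder>(new Int32Builder());
    case Kind::DOUBLE:
      return std::unique_ptr<Builder>(new DoubleBuilder());
    case Kind::LIST: {
      if (!type.value_type) return nullptr;
      // Recurse to build the child first, then wrap it.
      std::unique_ptr<Builder> values = MakeBuilder(*type.value_type);
      if (!values) return nullptr;  // propagate failure from the child
      return std::unique_ptr<Builder>(new ListBuilder(std::move(values)));
    }
    default:
      return nullptr;
  }
}
```

Because the list case recurses, nested lists need no extra code: a list of lists of int32 yields a ListBuilder wrapping a ListBuilder wrapping an Int32Builder.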

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/construct.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/construct.h b/cpp/src/arrow/types/construct.h
new file mode 100644
index 0000000..c0bfedd
--- /dev/null
+++ b/cpp/src/arrow/types/construct.h
@@ -0,0 +1,32 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_CONSTRUCT_H
+#define ARROW_TYPES_CONSTRUCT_H
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+class ArrayBuilder;
+class Status;
+
+Status make_builder(const TypePtr& type, ArrayBuilder** out);
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_CONSTRUCT_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/datetime.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/datetime.h b/cpp/src/arrow/types/datetime.h
new file mode 100644
index 0000000..b4d6252
--- /dev/null
+++ b/cpp/src/arrow/types/datetime.h
@@ -0,0 +1,79 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_DATETIME_H
+#define ARROW_TYPES_DATETIME_H
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct DateType : public DataType {
+  enum class Unit: char {
+    DAY = 0,
+    MONTH = 1,
+    YEAR = 2
+  };
+
+  Unit unit;
+
+  explicit DateType(Unit unit = Unit::DAY, bool nullable = true)
+      : DataType(TypeEnum::DATE, nullable),
+        unit(unit) {}
+
+  DateType(const DateType& other)
+      : DateType(other.unit, other.nullable) {}
+
+  static char const *name() {
+    return "date";
+  }
+
+  // virtual std::string ToString() {
+  //   return name();
+  // }
+};
+
+
+struct TimestampType : public DataType {
+  enum class Unit: char {
+    SECOND = 0,
+    MILLI = 1,
+    MICRO = 2,
+    NANO = 3
+  };
+
+  Unit unit;
+
+  explicit TimestampType(Unit unit = Unit::MILLI, bool nullable = true)
+      : DataType(TypeEnum::TIMESTAMP, nullable),
+        unit(unit) {}
+
+  TimestampType(const TimestampType& other)
+      : TimestampType(other.unit, other.nullable) {}
+
+  static char const *name() {
+    return "timestamp";
+  }
+
+  // virtual std::string ToString() {
+  //   return name();
+  // }
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_DATETIME_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/decimal.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/decimal.h b/cpp/src/arrow/types/decimal.h
new file mode 100644
index 0000000..464c3ff
--- /dev/null
+++ b/cpp/src/arrow/types/decimal.h
@@ -0,0 +1,32 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_DECIMAL_H
+#define ARROW_TYPES_DECIMAL_H
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct DecimalType : public DataType {
+  int precision;
+  int scale;
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_DECIMAL_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/floating.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/floating.cc b/cpp/src/arrow/types/floating.cc
new file mode 100644
index 0000000..bde2826
--- /dev/null
+++ b/cpp/src/arrow/types/floating.cc
@@ -0,0 +1,22 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/floating.h"
+
+namespace arrow {
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/floating.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/floating.h b/cpp/src/arrow/types/floating.h
new file mode 100644
index 0000000..7551ce6
--- /dev/null
+++ b/cpp/src/arrow/types/floating.h
@@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_FLOATING_H
+#define ARROW_TYPES_FLOATING_H
+
+#include <string>
+
+#include "arrow/types/primitive.h"
+
+namespace arrow {
+
+struct FloatType : public PrimitiveType<FloatType> {
+  PRIMITIVE_DECL(FloatType, float, FLOAT, 4, "float");
+};
+
+struct DoubleType : public PrimitiveType<DoubleType> {
+  PRIMITIVE_DECL(DoubleType, double, DOUBLE, 8, "double");
+};
+
+typedef PrimitiveArrayImpl<FloatType> FloatArray;
+typedef PrimitiveArrayImpl<DoubleType> DoubleArray;
+
+typedef PrimitiveBuilder<FloatType, FloatArray> FloatBuilder;
+typedef PrimitiveBuilder<DoubleType, DoubleArray> DoubleBuilder;
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_FLOATING_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/integer.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/integer.cc b/cpp/src/arrow/types/integer.cc
new file mode 100644
index 0000000..4696536
--- /dev/null
+++ b/cpp/src/arrow/types/integer.cc
@@ -0,0 +1,22 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/integer.h"
+
+namespace arrow {
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/integer.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/integer.h b/cpp/src/arrow/types/integer.h
new file mode 100644
index 0000000..7e5eab5
--- /dev/null
+++ b/cpp/src/arrow/types/integer.h
@@ -0,0 +1,88 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_INTEGER_H
+#define ARROW_TYPES_INTEGER_H
+
+#include <cstdint>
+#include <string>
+
+#include "arrow/types/primitive.h"
+
+namespace arrow {
+
+struct UInt8Type : public PrimitiveType<UInt8Type> {
+  PRIMITIVE_DECL(UInt8Type, uint8_t, UINT8, 1, "uint8");
+};
+
+struct Int8Type : public PrimitiveType<Int8Type> {
+  PRIMITIVE_DECL(Int8Type, int8_t, INT8, 1, "int8");
+};
+
+struct UInt16Type : public PrimitiveType<UInt16Type> {
+  PRIMITIVE_DECL(UInt16Type, uint16_t, UINT16, 2, "uint16");
+};
+
+struct Int16Type : public PrimitiveType<Int16Type> {
+  PRIMITIVE_DECL(Int16Type, int16_t, INT16, 2, "int16");
+};
+
+struct UInt32Type : public PrimitiveType<UInt32Type> {
+  PRIMITIVE_DECL(UInt32Type, uint32_t, UINT32, 4, "uint32");
+};
+
+struct Int32Type : public PrimitiveType<Int32Type> {
+  PRIMITIVE_DECL(Int32Type, int32_t, INT32, 4, "int32");
+};
+
+struct UInt64Type : public PrimitiveType<UInt64Type> {
+  PRIMITIVE_DECL(UInt64Type, uint64_t, UINT64, 8, "uint64");
+};
+
+struct Int64Type : public PrimitiveType<Int64Type> {
+  PRIMITIVE_DECL(Int64Type, int64_t, INT64, 8, "int64");
+};
+
+// Array containers
+
+typedef PrimitiveArrayImpl<UInt8Type> UInt8Array;
+typedef PrimitiveArrayImpl<Int8Type> Int8Array;
+
+typedef PrimitiveArrayImpl<UInt16Type> UInt16Array;
+typedef PrimitiveArrayImpl<Int16Type> Int16Array;
+
+typedef PrimitiveArrayImpl<UInt32Type> UInt32Array;
+typedef PrimitiveArrayImpl<Int32Type> Int32Array;
+
+typedef PrimitiveArrayImpl<UInt64Type> UInt64Array;
+typedef PrimitiveArrayImpl<Int64Type> Int64Array;
+
+// Builders
+
+typedef PrimitiveBuilder<UInt8Type, UInt8Array> UInt8Builder;
+typedef PrimitiveBuilder<UInt16Type, UInt16Array> UInt16Builder;
+typedef PrimitiveBuilder<UInt32Type, UInt32Array> UInt32Builder;
+typedef PrimitiveBuilder<UInt64Type, UInt64Array> UInt64Builder;
+
+typedef PrimitiveBuilder<Int8Type, Int8Array> Int8Builder;
+typedef PrimitiveBuilder<Int16Type, Int16Array> Int16Builder;
+typedef PrimitiveBuilder<Int32Type, Int32Array> Int32Builder;
+typedef PrimitiveBuilder<Int64Type, Int64Array> Int64Builder;
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_INTEGER_H
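
Each integer type above derives from PrimitiveType parameterized on itself and fills in its details through PRIMITIVE_DECL. Both are defined in arrow/types/primitive.h, which is not shown in this part of the diff, so the following is only a hypothetical sketch of how such a pattern could work. The names PrimitiveTypeSketch and PRIMITIVE_DECL_SKETCH, and the members the macro generates, are assumptions and not the actual definitions.

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch; the real PrimitiveType and PRIMITIVE_DECL live in
// arrow/types/primitive.h and may differ.
template <typename Derived>
struct PrimitiveTypeSketch {
  bool nullable;
  explicit PrimitiveTypeSketch(bool nullable = true) : nullable(nullable) {}

  // The base class reads static members declared by the derived type, so no
  // virtual dispatch is needed to get the name or the byte width.
  std::string ToString() const { return Derived::name(); }
  static int ByteWidth() { return Derived::size; }
};

// Expands to the per-type boilerplate: C type, byte width, constructor, name.
#define PRIMITIVE_DECL_SKETCH(TYPENAME, C_TYPE, SIZE, NAME)        \
  typedef C_TYPE c_type;                                           \
  static constexpr int size = SIZE;                                \
  explicit TYPENAME(bool nullable = true)                          \
      : PrimitiveTypeSketch<TYPENAME>(nullable) {}                 \
  static const char* name() { return NAME; }

struct Int32TypeSketch : public PrimitiveTypeSketch<Int32TypeSketch> {
  PRIMITIVE_DECL_SKETCH(Int32TypeSketch, int32_t, 4, "int32");
};
```

A declaration like this keeps each concrete type to a single macro line, which matches the shape of the eight integer types in the header.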

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/json.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/json.cc b/cpp/src/arrow/types/json.cc
new file mode 100644
index 0000000..b29b957
--- /dev/null
+++ b/cpp/src/arrow/types/json.cc
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/json.h"
+
+#include <vector>
+
+#include "arrow/types/boolean.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/floating.h"
+#include "arrow/types/null.h"
+#include "arrow/types/string.h"
+#include "arrow/types/union.h"
+
+namespace arrow {
+
+static const TypePtr Null(new NullType());
+static const TypePtr Int32(new Int32Type());
+static const TypePtr String(new StringType());
+static const TypePtr Double(new DoubleType());
+static const TypePtr Bool(new BooleanType());
+
+static const std::vector<TypePtr> json_types = {Null, Int32, String,
+                                                Double, Bool};
+TypePtr JSONScalar::dense_type = TypePtr(new DenseUnionType(json_types));
+TypePtr JSONScalar::sparse_type = TypePtr(new SparseUnionType(json_types));
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/json.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/json.h b/cpp/src/arrow/types/json.h
new file mode 100644
index 0000000..91fd132
--- /dev/null
+++ b/cpp/src/arrow/types/json.h
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_JSON_H
+#define ARROW_TYPES_JSON_H
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct JSONScalar : public DataType {
+  bool dense;
+
+  static TypePtr dense_type;
+  static TypePtr sparse_type;
+
+  explicit JSONScalar(bool dense = true, bool nullable = true)
+      : DataType(TypeEnum::JSON_SCALAR, nullable),
+        dense(dense) {}
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_JSON_H
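
JSONScalar keeps a dense and a sparse union type over the same child types. As a rough illustration of the difference, based on my reading of the union layouts described in the Arrow layout draft: a dense union stores a type id and an offset per slot, with each child holding only its own values, while a sparse union stores only a type id per slot and every child is as long as the union itself. The structs and helpers below are hypothetical and use a two-member union (int32 | double) rather than the five JSON types.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical two-member union (int32 | double), for illustration only.
struct DenseUnion {
  std::vector<int8_t> type_ids;  // per slot: 0 = int32 child, 1 = double child
  std::vector<int32_t> offsets;  // per slot: index into the selected child
  std::vector<int32_t> ints;     // child 0, holds only the int32 values
  std::vector<double> doubles;   // child 1, holds only the double values
};

struct SparseUnion {
  std::vector<int8_t> type_ids;  // per slot: 0 = int32 child, 1 = double child
  std::vector<int32_t> ints;     // child 0, same length as type_ids
  std::vector<double> doubles;   // child 1, same length as type_ids
};

// Reads slot i as a double regardless of which child holds it.
double GetAsDouble(const DenseUnion& u, int i) {
  const int32_t j = u.offsets[i];
  return u.type_ids[i] == 0 ? static_cast<double>(u.ints[j]) : u.doubles[j];
}

double GetAsDouble(const SparseUnion& u, int i) {
  return u.type_ids[i] == 0 ? static_cast<double>(u.ints[i]) : u.doubles[i];
}

// Builds the logical sequence [7, 1.5, 9] in the dense layout.
DenseUnion MakeDenseExample() {
  return DenseUnion{{0, 1, 0}, {0, 0, 1}, {7, 9}, {1.5}};
}

// Same sequence in the sparse layout; unused child slots are padding.
SparseUnion MakeSparseExample() {
  return SparseUnion{{0, 1, 0}, {7, 0, 9}, {0.0, 1.5, 0.0}};
}
```

The dense form costs an extra offsets buffer but keeps the children compact. The sparse form needs no offsets, at the price of padding in every child.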


[05/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/ValueHolderHelper.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/ValueHolderHelper.java b/java/vector/src/main/java/org/apache/arrow/vector/ValueHolderHelper.java
new file mode 100644
index 0000000..61ce285
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/ValueHolderHelper.java
@@ -0,0 +1,203 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.holders.BigIntHolder;
+import org.apache.arrow.vector.holders.BitHolder;
+import org.apache.arrow.vector.holders.DateHolder;
+import org.apache.arrow.vector.holders.Decimal18Holder;
+import org.apache.arrow.vector.holders.Decimal28SparseHolder;
+import org.apache.arrow.vector.holders.Decimal38SparseHolder;
+import org.apache.arrow.vector.holders.Decimal9Holder;
+import org.apache.arrow.vector.holders.Float4Holder;
+import org.apache.arrow.vector.holders.Float8Holder;
+import org.apache.arrow.vector.holders.IntHolder;
+import org.apache.arrow.vector.holders.IntervalDayHolder;
+import org.apache.arrow.vector.holders.IntervalYearHolder;
+import org.apache.arrow.vector.holders.NullableBitHolder;
+import org.apache.arrow.vector.holders.TimeHolder;
+import org.apache.arrow.vector.holders.TimeStampHolder;
+import org.apache.arrow.vector.holders.VarCharHolder;
+import org.apache.arrow.vector.util.DecimalUtility;
+
+import com.google.common.base.Charsets;
+
+
+public class ValueHolderHelper {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ValueHolderHelper.class);
+
+  public static IntHolder getIntHolder(int value) {
+    IntHolder holder = new IntHolder();
+    holder.value = value;
+
+    return holder;
+  }
+
+  public static BigIntHolder getBigIntHolder(long value) {
+    BigIntHolder holder = new BigIntHolder();
+    holder.value = value;
+
+    return holder;
+  }
+
+  public static Float4Holder getFloat4Holder(float value) {
+    Float4Holder holder = new Float4Holder();
+    holder.value = value;
+
+    return holder;
+  }
+
+  public static Float8Holder getFloat8Holder(double value) {
+    Float8Holder holder = new Float8Holder();
+    holder.value = value;
+
+    return holder;
+  }
+
+  public static DateHolder getDateHolder(long value) {
+    DateHolder holder = new DateHolder();
+    holder.value = value;
+    return holder;
+  }
+
+  public static TimeHolder getTimeHolder(int value) {
+    TimeHolder holder = new TimeHolder();
+    holder.value = value;
+    return holder;
+  }
+
+  public static TimeStampHolder getTimeStampHolder(long value) {
+    TimeStampHolder holder = new TimeStampHolder();
+    holder.value = value;
+    return holder;
+  }
+
+  public static BitHolder getBitHolder(int value) {
+    BitHolder holder = new BitHolder();
+    holder.value = value;
+
+    return holder;
+  }
+
+  public static NullableBitHolder getNullableBitHolder(boolean isNull, int value) {
+    NullableBitHolder holder = new NullableBitHolder();
+    holder.isSet = isNull ? 0 : 1;
+    if (!isNull) {
+      holder.value = value;
+    }
+
+    return holder;
+  }
+
+  public static VarCharHolder getVarCharHolder(ArrowBuf buf, String s){
+    VarCharHolder vch = new VarCharHolder();
+
+    byte[] b = s.getBytes(Charsets.UTF_8);
+    vch.start = 0;
+    vch.end = b.length;
+    vch.buffer = buf.reallocIfNeeded(b.length);
+    vch.buffer.setBytes(0, b);
+    return vch;
+  }
+
+  public static VarCharHolder getVarCharHolder(BufferAllocator a, String s){
+    VarCharHolder vch = new VarCharHolder();
+
+    byte[] b = s.getBytes(Charsets.UTF_8);
+    vch.start = 0;
+    vch.end = b.length;
+    vch.buffer = a.buffer(b.length);
+    vch.buffer.setBytes(0, b);
+    return vch;
+  }
+
+
+  public static IntervalYearHolder getIntervalYearHolder(int intervalYear) {
+    IntervalYearHolder holder = new IntervalYearHolder();
+
+    holder.value = intervalYear;
+    return holder;
+  }
+
+  public static IntervalDayHolder getIntervalDayHolder(int days, int millis) {
+    IntervalDayHolder dch = new IntervalDayHolder();
+
+    dch.days = days;
+    dch.milliseconds = millis;
+    return dch;
+  }
+
+  public static Decimal9Holder getDecimal9Holder(int decimal, int scale, int precision) {
+    Decimal9Holder dch = new Decimal9Holder();
+
+    dch.scale = scale;
+    dch.precision = precision;
+    dch.value = decimal;
+
+    return dch;
+  }
+
+  public static Decimal18Holder getDecimal18Holder(long decimal, int scale, int precision) {
+    Decimal18Holder dch = new Decimal18Holder();
+
+    dch.scale = scale;
+    dch.precision = precision;
+    dch.value = decimal;
+
+    return dch;
+  }
+
+  public static Decimal28SparseHolder getDecimal28Holder(ArrowBuf buf, String decimal) {
+
+    Decimal28SparseHolder dch = new Decimal28SparseHolder();
+
+    BigDecimal bigDecimal = new BigDecimal(decimal);
+
+    dch.scale = bigDecimal.scale();
+    dch.precision = bigDecimal.precision();
+    dch.start = 0;
+    dch.buffer = buf.reallocIfNeeded(5 * DecimalUtility.INTEGER_SIZE);
+    Decimal28SparseHolder.setSign(bigDecimal.signum() == -1, dch.start, dch.buffer);
+    DecimalUtility
+        .getSparseFromBigDecimal(bigDecimal, dch.buffer, dch.start, dch.scale, dch.precision, dch.nDecimalDigits);
+
+    return dch;
+  }
+
+  public static Decimal38SparseHolder getDecimal38Holder(ArrowBuf buf, String decimal) {
+
+    Decimal38SparseHolder dch = new Decimal38SparseHolder();
+
+    BigDecimal bigDecimal = new BigDecimal(decimal);
+
+    dch.scale = bigDecimal.scale();
+    dch.precision = bigDecimal.precision();
+    dch.start = 0;
+    dch.buffer = buf.reallocIfNeeded(dch.maxPrecision * DecimalUtility.INTEGER_SIZE);
+    Decimal38SparseHolder.setSign(bigDecimal.signum() == -1, dch.start, dch.buffer);
+    DecimalUtility
+        .getSparseFromBigDecimal(bigDecimal, dch.buffer, dch.start, dch.scale, dch.precision, dch.nDecimalDigits);
+
+    return dch;
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/ValueVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/ValueVector.java b/java/vector/src/main/java/org/apache/arrow/vector/ValueVector.java
new file mode 100644
index 0000000..c05f0e7
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/ValueVector.java
@@ -0,0 +1,222 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import java.io.Closeable;
+
+import io.netty.buffer.ArrowBuf;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.util.TransferPair;
+
+/**
+ * An abstraction that is used to store a sequence of values in an individual column.
+ *
+ * A {@link ValueVector value vector} stores underlying data in-memory in a columnar fashion that is compact and
+ * efficient. The column whose data is stored is described by {@link #getField()}.
+ *
+ * A vector, when instantiated, relies on a {@link org.apache.drill.exec.record.DeadBuf dead buffer}. It is important
+ * that the vector is allocated before attempting to read or write.
+ *
+ * There are a few "rules" around vectors:
+ *
+ * <ul>
+ *   <li>values need to be written in order (e.g. index 0, 1, 2, 5)</li>
+ *   <li>null vectors start with all values as null before writing anything</li>
+ *   <li>for variable width types, the offset vector should be all zeros before writing</li>
+ *   <li>you must call setValueCount before a vector can be read</li>
+ *   <li>you should never write to a vector once it has been read.</li>
+ * </ul>
+ *
+ * Please note that the current implementation doesn't enforce those rules, hence we may find a few places that
+ * deviate from these rules (e.g. offset vectors in Variable Length and Repeated vector)
+ *
+ * This interface "should" strive to guarantee this order of operation:
+ * <blockquote>
+ * allocate > mutate > setvaluecount > access > clear (or allocate to start the process over).
+ * </blockquote>
+ */
+public interface ValueVector extends Closeable, Iterable<ValueVector> {
+  /**
+   * Allocate new buffers. ValueVector implements logic to determine how much to allocate.
+   * @throws OutOfMemoryException Thrown if no memory can be allocated.
+   */
+  void allocateNew() throws OutOfMemoryException;
+
+  /**
+   * Allocates new buffers. ValueVector implements logic to determine how much to allocate.
+   * @return Returns true if allocation was successful.
+   */
+  boolean allocateNewSafe();
+
+  BufferAllocator getAllocator();
+
+  /**
+   * Set the initial record capacity
+   * @param numRecords
+   */
+  void setInitialCapacity(int numRecords);
+
+  /**
+   * Returns the maximum number of values that can be stored in this vector instance.
+   */
+  int getValueCapacity();
+
+  /**
+   * Alternative to clear(). Allows use as an AutoCloseable in try-with-resources.
+   */
+  @Override
+  void close();
+
+  /**
+   * Release the underlying ArrowBuf and reset the ValueVector to empty.
+   */
+  void clear();
+
+  /**
+   * Get information about how this field is materialized.
+   */
+  MaterializedField getField();
+
+  /**
+   * Returns a {@link org.apache.arrow.vector.util.TransferPair transfer pair}, creating a new target vector of
+   * the same type.
+   */
+  TransferPair getTransferPair(BufferAllocator allocator);
+
+  TransferPair getTransferPair(String ref, BufferAllocator allocator);
+
+  /**
+   * Returns a new {@link org.apache.arrow.vector.util.TransferPair transfer pair} that is used to transfer underlying
+   * buffers into the target vector.
+   */
+  TransferPair makeTransferPair(ValueVector target);
+
+  /**
+   * Returns an {@link org.apache.arrow.vector.ValueVector.Accessor accessor} that is used to read from this vector
+   * instance.
+   */
+  Accessor getAccessor();
+
+  /**
+   * Returns an {@link org.apache.arrow.vector.ValueVector.Mutator mutator} that is used to write to this vector
+   * instance.
+   */
+  Mutator getMutator();
+
+  /**
+   * Returns a {@link org.apache.arrow.vector.complex.reader.FieldReader field reader} that supports reading values
+   * from this vector.
+   */
+  FieldReader getReader();
+
+  /**
+   * Get the metadata for this field. Used in serialization
+   *
+   * @return FieldMetadata for this field.
+   */
+//  SerializedField getMetadata();
+
+  /**
+   * Returns the number of bytes that is used by this vector instance.
+   */
+  int getBufferSize();
+
+  /**
+   * Returns the number of bytes that is used by this vector if it holds the given number
+   * of values. The result will be the same as if Mutator.setValueCount() were called, followed
+   * by calling getBufferSize(), but without any of the closing side-effects that setValueCount()
+   * implies wrt finishing off the population of a vector. Some operations might wish to use
+   * this to determine how much memory has been used by a vector so far, even though it is
+   * not finished being populated.
+   *
+   * @param valueCount the number of values to assume this vector contains
+   * @return the buffer size if this vector is holding valueCount values
+   */
+  int getBufferSizeFor(int valueCount);
+
+  /**
+   * Return the underlying buffers associated with this vector. Note that this doesn't impact the reference counts for
+   * this buffer so it only should be used for in-context access. Also note that this buffer changes regularly thus
+   * external classes shouldn't hold a reference to it (unless they change it).
+   * @param clear Whether to clear vector before returning; the buffers will still be refcounted;
+   *   but the returned array will be the only reference to them
+   *
+   * @return The underlying {@link io.netty.buffer.ArrowBuf buffers} that are used by this vector instance.
+   */
+  ArrowBuf[] getBuffers(boolean clear);
+
+  /**
+   * Load the data provided in the buffer. Typically used when deserializing from the wire.
+   *
+   * @param metadata
+   *          Metadata used to decode the incoming buffer.
+   * @param buffer
+   *          The buffer that contains the ValueVector.
+   */
+//  void load(SerializedField metadata, DrillBuf buffer);
+
+  /**
+   * An abstraction that is used to read from this vector instance.
+   */
+  interface Accessor {
+    /**
+     * Get the Java Object representation of the element at the specified position. Useful for testing.
+     *
+     * @param index
+     *          Index of the value to get
+     */
+    Object getObject(int index);
+
+    /**
+     * Returns the number of values that is stored in this vector.
+     */
+    int getValueCount();
+
+    /**
+     * Returns true if the value at the given index is null, false otherwise.
+     */
+    boolean isNull(int index);
+  }
+
+  /**
+   * An abstraction that is used to write into this vector instance.
+   */
+  interface Mutator {
+    /**
+     * Sets the number of values that is stored in this vector to the given value count.
+     *
+     * @param valueCount  value count to set.
+     */
+    void setValueCount(int valueCount);
+
+    /**
+     * Resets the mutator to pristine state.
+     */
+    void reset();
+
+    /**
+     * @deprecated  this has nothing to do with value vector abstraction and should be removed.
+     */
+    @Deprecated
+    void generateTestData(int values);
+  }
+}
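
The order of operations described in the ValueVector Javadoc above (allocate > mutate > setValueCount > access > clear) can be sketched with a toy vector. This is only an illustration under stated assumptions: ToyIntVector and its methods are hypothetical and not part of the patch, and unlike the real implementation (which, per the Javadoc, does not enforce the rules) this sketch throws when the order is violated.

```java
public class LifecycleSketch {
    /** Hypothetical fixed-width int vector; not an Arrow class. */
    static final class ToyIntVector {
        private int[] data;           // null until allocateNew() is called
        private int valueCount = -1;  // -1 means "not yet readable"

        void allocateNew(int capacity) {
            data = new int[capacity];
            valueCount = -1;
        }

        void set(int index, int value) {
            if (data == null) {
                throw new IllegalStateException("allocate before writing");
            }
            if (valueCount >= 0) {
                throw new IllegalStateException("do not write after the vector has been made readable");
            }
            data[index] = value;
        }

        void setValueCount(int count) {
            if (data == null) {
                throw new IllegalStateException("allocate before setting the value count");
            }
            valueCount = count;
        }

        int get(int index) {
            if (valueCount < 0) {
                throw new IllegalStateException("call setValueCount before reading");
            }
            if (index < 0 || index >= valueCount) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return data[index];
        }

        int getValueCount() {
            return Math.max(valueCount, 0);
        }

        void clear() {
            data = null;
            valueCount = -1;
        }
    }

    /** Walks the full lifecycle once and returns the sum of the stored values. */
    static int sumOfThree() {
        ToyIntVector v = new ToyIntVector();
        v.allocateNew(4);       // allocate
        v.set(0, 10);           // mutate, in index order
        v.set(1, 20);
        v.set(2, 30);
        v.setValueCount(3);     // make the vector readable
        int sum = 0;
        for (int i = 0; i < v.getValueCount(); i++) {
            sum += v.get(i);    // access
        }
        v.clear();              // release
        return sum;
    }

    /** Returns true if reading before setValueCount is rejected by the sketch. */
    static boolean readBeforeSealFails() {
        ToyIntVector v = new ToyIntVector();
        v.allocateNew(1);
        v.set(0, 7);
        try {
            v.get(0);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfThree());
        System.out.println(readBeforeSealFails());
    }
}
```

In the patch itself the read and write sides are split into the Accessor and Mutator interfaces; the sketch folds them into one class only to keep the example short.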

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/VariableWidthVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/VariableWidthVector.java b/java/vector/src/main/java/org/apache/arrow/vector/VariableWidthVector.java
new file mode 100644
index 0000000..e227bb4
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/VariableWidthVector.java
@@ -0,0 +1,51 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+public interface VariableWidthVector extends ValueVector {
+
+  /**
+   * Allocate a new memory space for this vector.  Must be called prior to using the ValueVector.
+   *
+   * @param totalBytes   Desired size of the underlying data buffer.
+   * @param valueCount   Number of values in the vector.
+   */
+  void allocateNew(int totalBytes, int valueCount);
+
+  /**
+   * Provide the maximum amount of variable width bytes that can be stored in this vector.
+   * @return
+   */
+  int getByteCapacity();
+
+  VariableWidthMutator getMutator();
+
+  VariableWidthAccessor getAccessor();
+
+  interface VariableWidthAccessor extends Accessor {
+    int getValueLength(int index);
+  }
+
+  int getCurrentSizeInBytes();
+
+  interface VariableWidthMutator extends Mutator {
+    void setValueLengthSafe(int index, int length);
+  }
+}
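
The VariableWidthVector interface above exposes a per-value length and a byte capacity, and the ValueVector Javadoc mentions an offset vector that starts as all zeros. A minimal sketch of how those pieces could relate is given below. It assumes a layout where offsets[i] and offsets[i + 1] bound value i; ToyVarWidthVector, append and getString are hypothetical names used only for illustration and do not appear in the patch.

```java
import java.nio.charset.StandardCharsets;

public class OffsetSketch {
    /** Hypothetical variable-width vector backed by a byte array and an offsets array. */
    static final class ToyVarWidthVector {
        private final byte[] data;
        private final int[] offsets;  // assumed layout: value i spans offsets[i]..offsets[i + 1]
        private int valueCount = 0;

        ToyVarWidthVector(int totalBytes, int valueCapacity) {
            data = new byte[totalBytes];
            offsets = new int[valueCapacity + 1];  // Java zero-fills, matching "all zeros before writing"
        }

        /** Appends a value at the next index; values are written strictly in order. */
        void append(byte[] value) {
            int start = offsets[valueCount];
            System.arraycopy(value, 0, data, start, value.length);
            offsets[valueCount + 1] = start + value.length;
            valueCount++;
        }

        int getValueLength(int index) {
            return offsets[index + 1] - offsets[index];
        }

        int getCurrentSizeInBytes() {
            return offsets[valueCount];
        }

        int getByteCapacity() {
            return data.length;
        }

        String getString(int index) {
            return new String(data, offsets[index], getValueLength(index), StandardCharsets.UTF_8);
        }
    }

    /** Builds a small vector holding "ab", "" and "cde". */
    static ToyVarWidthVector demo() {
        ToyVarWidthVector v = new ToyVarWidthVector(16, 4);
        v.append("ab".getBytes(StandardCharsets.UTF_8));
        v.append(new byte[0]);
        v.append("cde".getBytes(StandardCharsets.UTF_8));
        return v;
    }

    public static void main(String[] args) {
        ToyVarWidthVector v = demo();
        System.out.println(v.getValueLength(0) + " " + v.getValueLength(1) + " " + v.getValueLength(2));
        System.out.println(v.getCurrentSizeInBytes() + " of " + v.getByteCapacity());
    }
}
```

With this layout an empty value costs no data bytes, and a value's length is the difference of two adjacent offsets, which is why the offsets need to be zero before the first write.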

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/VectorDescriptor.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/VectorDescriptor.java b/java/vector/src/main/java/org/apache/arrow/vector/VectorDescriptor.java
new file mode 100644
index 0000000..fdad99a
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/VectorDescriptor.java
@@ -0,0 +1,83 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import java.util.Collection;
+
+import com.google.common.base.Preconditions;
+
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.MajorType;
+
+public class VectorDescriptor {
+  private static final String DEFAULT_NAME = "NONE";
+
+  private final MaterializedField field;
+
+  public VectorDescriptor(final MajorType type) {
+    this(DEFAULT_NAME, type);
+  }
+
+  public VectorDescriptor(final String name, final MajorType type) {
+    this(MaterializedField.create(name, type));
+  }
+
+  public VectorDescriptor(final MaterializedField field) {
+    this.field = Preconditions.checkNotNull(field, "field cannot be null");
+  }
+
+  public MaterializedField getField() {
+    return field;
+  }
+
+  public MajorType getType() {
+    return field.getType();
+  }
+
+  public String getName() {
+    return field.getLastName();
+  }
+
+  public Collection<MaterializedField> getChildren() {
+    return field.getChildren();
+  }
+
+  public boolean hasName() {
+    return !DEFAULT_NAME.equals(getName());
+  }
+
+  public VectorDescriptor withName(final String name) {
+    return new VectorDescriptor(field.withPath(name));
+  }
+
+  public VectorDescriptor withType(final MajorType type) {
+    return new VectorDescriptor(field.withType(type));
+  }
+
+  public static VectorDescriptor create(final String name, final MajorType type) {
+    return new VectorDescriptor(name, type);
+  }
+
+  public static VectorDescriptor create(final MajorType type) {
+    return new VectorDescriptor(type);
+  }
+
+  public static VectorDescriptor create(final MaterializedField field) {
+    return new VectorDescriptor(field);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/VectorTrimmer.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/VectorTrimmer.java b/java/vector/src/main/java/org/apache/arrow/vector/VectorTrimmer.java
new file mode 100644
index 0000000..055857e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/VectorTrimmer.java
@@ -0,0 +1,33 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.ArrowBuf;
+
+public class VectorTrimmer {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(VectorTrimmer.class);
+
+  public static void trim(ByteBuf data, int idx) {
+    data.writerIndex(idx);
+    if (data instanceof ArrowBuf) {
+      // data.capacity(idx);
+      data.writerIndex(idx);
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/ZeroVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/ZeroVector.java b/java/vector/src/main/java/org/apache/arrow/vector/ZeroVector.java
new file mode 100644
index 0000000..78de870
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/ZeroVector.java
@@ -0,0 +1,181 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Iterator;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.impl.NullReader;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.TransferPair;
+
+import com.google.common.collect.Iterators;
+
+public class ZeroVector implements ValueVector {
+  public final static ZeroVector INSTANCE = new ZeroVector();
+
+  private final MaterializedField field = MaterializedField.create("[DEFAULT]", Types.required(MinorType.LATE));
+
+  private final TransferPair defaultPair = new TransferPair() {
+    @Override
+    public void transfer() { }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) { }
+
+    @Override
+    public ValueVector getTo() {
+      return ZeroVector.this;
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) { }
+  };
+
+  private final Accessor defaultAccessor = new Accessor() {
+    @Override
+    public Object getObject(int index) {
+      return null;
+    }
+
+    @Override
+    public int getValueCount() {
+      return 0;
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return true;
+    }
+  };
+
+  private final Mutator defaultMutator = new Mutator() {
+    @Override
+    public void setValueCount(int valueCount) { }
+
+    @Override
+    public void reset() { }
+
+    @Override
+    public void generateTestData(int values) { }
+  };
+
+  public ZeroVector() { }
+
+  @Override
+  public void close() { }
+
+  @Override
+  public void clear() { }
+
+  @Override
+  public MaterializedField getField() {
+    return field;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return defaultPair;
+  }
+
+//  @Override
+//  public UserBitShared.SerializedField getMetadata() {
+//    return getField()
+//        .getAsBuilder()
+//        .setBufferLength(getBufferSize())
+//        .setValueCount(getAccessor().getValueCount())
+//        .build();
+//  }
+
+  @Override
+  public Iterator iterator() {
+    return Iterators.emptyIterator();
+  }
+
+  @Override
+  public int getBufferSize() {
+    return 0;
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    return 0;
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    return new ArrowBuf[0];
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
+    allocateNewSafe();
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    return true;
+  }
+
+  @Override
+  public BufferAllocator getAllocator() {
+    throw new UnsupportedOperationException("Tried to get allocator from ZeroVector");
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) { }
+
+  @Override
+  public int getValueCapacity() {
+    return 0;
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return defaultPair;
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector target) {
+    return defaultPair;
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return defaultAccessor;
+  }
+
+  @Override
+  public Mutator getMutator() {
+    return defaultMutator;
+  }
+
+  @Override
+  public FieldReader getReader() {
+    return NullReader.INSTANCE;
+  }
+
+//  @Override
+//  public void load(UserBitShared.SerializedField metadata, DrillBuf buffer) { }
+}
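
ZeroVector above reads as a null-object implementation of ValueVector: every method is a no-op or returns an empty/zero result, so callers can hold a vector reference without null checks. A stripped-down sketch of that pattern follows. ToyVector and ToyZeroVector are hypothetical stand-ins with a much smaller surface than the real interface, and the private constructor is a choice made for this sketch (the ZeroVector in the patch has a public constructor).

```java
import java.util.Collections;
import java.util.Iterator;

public class NullObjectSketch {
    /** Hypothetical, heavily reduced stand-in for the ValueVector interface. */
    interface ToyVector extends Iterable<ToyVector> {
        int getValueCount();
        boolean isNull(int index);
        int getBufferSize();
    }

    /** Null object: holds no values, owns no memory, has no children. */
    static final class ToyZeroVector implements ToyVector {
        static final ToyZeroVector INSTANCE = new ToyZeroVector();

        private ToyZeroVector() { }

        @Override
        public int getValueCount() {
            return 0;
        }

        @Override
        public boolean isNull(int index) {
            return true;  // every position is reported as null
        }

        @Override
        public int getBufferSize() {
            return 0;
        }

        @Override
        public Iterator<ToyVector> iterator() {
            return Collections.emptyIterator();
        }
    }

    /** Sums buffer sizes without special-casing the empty vector. */
    static int totalBufferSize(ToyVector... vectors) {
        int total = 0;
        for (ToyVector v : vectors) {
            total += v.getBufferSize();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalBufferSize(ToyZeroVector.INSTANCE, ToyZeroVector.INSTANCE));
    }
}
```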

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractContainerVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractContainerVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractContainerVector.java
new file mode 100644
index 0000000..c671c9e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractContainerVector.java
@@ -0,0 +1,143 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import java.util.Collection;
+
+import javax.annotation.Nullable;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.CallBack;
+
+import com.google.common.base.Function;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Sets;
+
+/**
+ * Base class for composite vectors.
+ *
+ * This class implements common functionality of composite vectors.
+ */
+public abstract class AbstractContainerVector implements ValueVector {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractContainerVector.class);
+
+  protected MaterializedField field;
+  protected final BufferAllocator allocator;
+  protected final CallBack callBack;
+
+  protected AbstractContainerVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    this.field = Preconditions.checkNotNull(field);
+    this.allocator = allocator;
+    this.callBack = callBack;
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
+    if (!allocateNewSafe()) {
+      throw new OutOfMemoryException();
+    }
+  }
+
+  public BufferAllocator getAllocator() {
+    return allocator;
+  }
+
+  /**
+   * Returns the field definition of this instance.
+   */
+  @Override
+  public MaterializedField getField() {
+    return field;
+  }
+
+  /**
+   * Returns a {@link org.apache.arrow.vector.ValueVector} corresponding to the given field name if exists or null.
+   */
+  public ValueVector getChild(String name) {
+    return getChild(name, ValueVector.class);
+  }
+
+  /**
+   * Returns a sequence of field names in the order that they show up in the schema.
+   */
+  protected Collection<String> getChildFieldNames() {
+    return Sets.newLinkedHashSet(Iterables.transform(field.getChildren(), new Function<MaterializedField, String>() {
+      @Nullable
+      @Override
+      public String apply(MaterializedField field) {
+        return Preconditions.checkNotNull(field).getLastName();
+      }
+    }));
+  }
+
+  /**
+   * Clears out all underlying child vectors.
+   */
+  @Override
+  public void close() {
+    for (ValueVector vector:(Iterable<ValueVector>)this) {
+      vector.close();
+    }
+  }
+
+  protected <T extends ValueVector> T typeify(ValueVector v, Class<T> clazz) {
+    if (clazz.isAssignableFrom(v.getClass())) {
+      return (T) v;
+    }
+    throw new IllegalStateException(String.format("Vector requested [%s] was different than type stored [%s].  Drill doesn't yet support heterogeneous types.", clazz.getSimpleName(), v.getClass().getSimpleName()));
+  }
+
+  MajorType getLastPathType() {
+    if((this.getField().getType().getMinorType() == MinorType.LIST  &&
+        this.getField().getType().getMode() == DataMode.REPEATED)) {  // Use Repeated scalar type instead of Required List.
+      VectorWithOrdinal vord = getChildVectorWithOrdinal(null);
+      ValueVector v = vord.vector;
+      if (! (v instanceof  AbstractContainerVector)) {
+        return v.getField().getType();
+      }
+    } else if (this.getField().getType().getMinorType() == MinorType.MAP  &&
+        this.getField().getType().getMode() == DataMode.REPEATED) {  // Use Required Map
+      return new MajorType(MinorType.MAP, DataMode.REQUIRED);
+    }
+
+    return this.getField().getType();
+  }
+
+  protected boolean supportsDirectRead() {
+    return false;
+  }
+
+  // return the number of child vectors
+  public abstract int size();
+
+  // add a new vector with the input MajorType or return the existing vector if we already added one with the same type
+  public abstract <T extends ValueVector> T addOrGet(String name, MajorType type, Class<T> clazz);
+
+  // return the child vector with the input name
+  public abstract <T extends ValueVector> T getChild(String name, Class<T> clazz);
+
+  // return the child vector's ordinal in the composite container
+  public abstract VectorWithOrdinal getChildVectorWithOrdinal(String name);
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractMapVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractMapVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractMapVector.java
new file mode 100644
index 0000000..d4189b2
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/AbstractMapVector.java
@@ -0,0 +1,278 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.util.BasicTypeHelper;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.MapWithOrdinal;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+
+/*
+ * Base class for MapVectors. Currently used by RepeatedMapVector and MapVector
+ */
+public abstract class AbstractMapVector extends AbstractContainerVector {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractMapVector.class);
+
+  // Maintains a map with key as field name and value is the vector itself
+  private final MapWithOrdinal<String, ValueVector> vectors =  new MapWithOrdinal<>();
+
+  protected AbstractMapVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    super(field.clone(), allocator, callBack);
+    MaterializedField clonedField = field.clone();
+    // create the hierarchy of the child vectors based on the materialized field
+    for (MaterializedField child : clonedField.getChildren()) {
+      if (!child.equals(BaseRepeatedValueVector.OFFSETS_FIELD)) {
+        final String fieldName = child.getLastName();
+        final ValueVector v = BasicTypeHelper.getNewVector(child, allocator, callBack);
+        putVector(fieldName, v);
+      }
+    }
+  }
+
+  @Override
+  public void close() {
+    for(final ValueVector valueVector : vectors.values()) {
+      valueVector.close();
+    }
+    vectors.clear();
+
+    super.close();
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    /* Tracks whether all memory allocations were successful.
+     * Used in the case of composite vectors when we need to allocate multiple
+     * buffers for multiple vectors. If one of the allocations fails, we need to
+     * clear all the memory that we allocated.
+     */
+    boolean success = false;
+    try {
+      for (final ValueVector v : vectors.values()) {
+        if (!v.allocateNewSafe()) {
+          return false;
+        }
+      }
+      success = true;
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    return true;
+  }
+
+  /**
+   * Adds a new field with the given parameters or replaces the existing one and consequently returns the resultant
+   * {@link org.apache.arrow.vector.ValueVector}.
+   *
+   * Execution takes place in the following order:
+   * <ul>
+   *   <li>
+   *     if field is new, create and insert a new vector of desired type.
+   *   </li>
+   *   <li>
+   *     if field exists and existing vector is of desired vector type, return the vector.
+   *   </li>
+   *   <li>
+   *     if field exists and null filled, clear the existing vector; create and insert a new vector of desired type.
+   *   </li>
+   *   <li>
+   *     otherwise, throw an {@link java.lang.IllegalStateException}
+   *   </li>
+   * </ul>
+   *
+   * @param name name of the field
+   * @param type type of the field
+   * @param clazz class of expected vector type
+   * @param <T> class type of expected vector type
+   * @throws java.lang.IllegalStateException raised if there is a hard schema change
+   *
+   * @return resultant {@link org.apache.arrow.vector.ValueVector}
+   */
+  @Override
+  public <T extends ValueVector> T addOrGet(String name, MajorType type, Class<T> clazz) {
+    final ValueVector existing = getChild(name);
+    boolean create = false;
+    if (existing == null) {
+      create = true;
+    } else if (clazz.isAssignableFrom(existing.getClass())) {
+      return (T) existing;
+    } else if (nullFilled(existing)) {
+      existing.clear();
+      create = true;
+    }
+    if (create) {
+      final T vector = (T) BasicTypeHelper.getNewVector(name, allocator, type, callBack);
+      putChild(name, vector);
+      if (callBack!=null) {
+        callBack.doWork();
+      }
+      return vector;
+    }
+    final String message = "Drill does not support schema change yet. Existing[%s] and desired[%s] vector types mismatch";
+    throw new IllegalStateException(String.format(message, existing.getClass().getSimpleName(), clazz.getSimpleName()));
+  }
+
+  private boolean nullFilled(ValueVector vector) {
+    for (int r = 0; r < vector.getAccessor().getValueCount(); r++) {
+      if (!vector.getAccessor().isNull(r)) {
+        return false;
+      }
+    }
+    return true;
+  }
+
+  /**
+   * Returns a {@link org.apache.arrow.vector.ValueVector} corresponding to the given ordinal identifier.
+   */
+  public ValueVector getChildByOrdinal(int id) {
+    return vectors.getByOrdinal(id);
+  }
+
+  /**
+   * Returns a {@link org.apache.arrow.vector.ValueVector} instance of subtype of <T> corresponding to the given
+   * field name if exists or null.
+   */
+  @Override
+  public <T extends ValueVector> T getChild(String name, Class<T> clazz) {
+    final ValueVector v = vectors.get(name.toLowerCase());
+    if (v == null) {
+      return null;
+    }
+    return typeify(v, clazz);
+  }
+
+  /**
+   * Inserts the vector with the given name if it does not exist else replaces it with the new value.
+   *
+   * Note that this method does not enforce any vector type check nor throws a schema change exception.
+   */
+  protected void putChild(String name, ValueVector vector) {
+    putVector(name, vector);
+    field.addChild(vector.getField());
+  }
+
+  /**
+   * Inserts the input vector into the map if the name is not yet present; otherwise replaces the existing entry.
+   * @param name  field name
+   * @param vector  vector to be inserted
+   */
+  protected void putVector(String name, ValueVector vector) {
+    final ValueVector old = vectors.put(
+        Preconditions.checkNotNull(name, "field name cannot be null").toLowerCase(),
+        Preconditions.checkNotNull(vector, "vector cannot be null")
+    );
+    if (old != null && old != vector) {
+      logger.debug("Field [{}] mutated from [{}] to [{}]", name, old.getClass().getSimpleName(),
+                   vector.getClass().getSimpleName());
+    }
+  }
+
+  /**
+   * Returns a sequence of underlying child vectors.
+   */
+  protected Collection<ValueVector> getChildren() {
+    return vectors.values();
+  }
+
+  /**
+   * Returns the number of underlying child vectors.
+   */
+  @Override
+  public int size() {
+    return vectors.size();
+  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    return vectors.values().iterator();
+  }
+
+  /**
+   * Returns a list of all scalar (non-map) child vectors, recursing through the entire vector hierarchy.
+   */
+  public List<ValueVector> getPrimitiveVectors() {
+    final List<ValueVector> primitiveVectors = Lists.newArrayList();
+    for (final ValueVector v : vectors.values()) {
+      if (v instanceof AbstractMapVector) {
+        AbstractMapVector mapVector = (AbstractMapVector) v;
+        primitiveVectors.addAll(mapVector.getPrimitiveVectors());
+      } else {
+        primitiveVectors.add(v);
+      }
+    }
+    return primitiveVectors;
+  }
+
+  /**
+   * Returns the vector together with its ordinal if the field exists, or null otherwise.
+   */
+  @Override
+  public VectorWithOrdinal getChildVectorWithOrdinal(String name) {
+    final int ordinal = vectors.getOrdinal(name.toLowerCase());
+    if (ordinal < 0) {
+      return null;
+    }
+    final ValueVector vector = vectors.getByOrdinal(ordinal);
+    return new VectorWithOrdinal(vector, ordinal);
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final List<ArrowBuf> buffers = Lists.newArrayList();
+
+    for (final ValueVector vector : vectors.values()) {
+      for (final ArrowBuf buf : vector.getBuffers(false)) {
+        buffers.add(buf);
+        if (clear) {
+          buf.retain(1);
+        }
+      }
+      if (clear) {
+        vector.clear();
+      }
+    }
+
+    return buffers.toArray(new ArrowBuf[buffers.size()]);
+  }
+
+  @Override
+  public int getBufferSize() {
+    int actualBufSize = 0;
+
+    for (final ValueVector v : vectors.values()) {
+      for (final ArrowBuf buf : v.getBuffers(false)) {
+        actualBufSize += buf.writerIndex();
+      }
+    }
+    return actualBufSize;
+  }
+}
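
Editor's note on addOrGet above: the method has three outcomes (reuse a type-compatible child, recreate a child that is absent or holds only nulls, or fail on a type mismatch). The following is a minimal standalone sketch of that control flow only. It does not use the Arrow classes: a "vector" is modelled as a typed Java array, and the names AddOrGetSketch, rows and children are assumptions made for illustration.

```java
import java.lang.reflect.Array;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, not Arrow's API: each child "vector" is a typed array.
final class AddOrGetSketch {
  // Children keyed by lower-cased field name, as in the diff above.
  private final Map<String, Object[]> children = new LinkedHashMap<>();
  private final int rows;

  AddOrGetSketch(int rows) {
    this.rows = rows;
  }

  <T> T[] addOrGet(String name, Class<T[]> clazz) {
    final String key = name.toLowerCase();
    final Object[] existing = children.get(key);
    if (existing != null && clazz.isInstance(existing)) {
      return clazz.cast(existing); // compatible child already present: reuse it
    }
    if (existing == null || nullFilled(existing)) {
      // Absent, or present but holding only nulls: (re)create with the requested type.
      final T[] created = clazz.cast(Array.newInstance(clazz.getComponentType(), rows));
      children.put(key, created);
      return created;
    }
    // Present, holds data, and has an incompatible type: treated as a schema change.
    throw new IllegalStateException(String.format(
        "Existing[%s] and desired[%s] types mismatch",
        existing.getClass().getSimpleName(), clazz.getSimpleName()));
  }

  private static boolean nullFilled(Object[] column) {
    for (Object o : column) {
      if (o != null) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    AddOrGetSketch map = new AddOrGetSketch(2);
    String[] names = map.addOrGet("Name", String[].class);
    // Lookup is case-insensitive, so this returns the same array instance.
    System.out.println(names == map.addOrGet("name", String[].class));
    // The String column is still all null, so it may be replaced by another type.
    Integer[] ids = map.addOrGet("name", Integer[].class);
    System.out.println(ids.length);
  }
}
```

Once a child holds at least one non-null value, asking for an incompatible type throws instead of silently replacing the data, which mirrors the IllegalStateException path in the diff.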

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/BaseRepeatedValueVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/BaseRepeatedValueVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/BaseRepeatedValueVector.java
new file mode 100644
index 0000000..6518897
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/BaseRepeatedValueVector.java
@@ -0,0 +1,260 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Collections;
+import java.util.Iterator;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.AddOrGetResult;
+import org.apache.arrow.vector.BaseValueVector;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+import org.apache.arrow.vector.ZeroVector;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.BasicTypeHelper;
+import org.apache.arrow.vector.util.SchemaChangeRuntimeException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ObjectArrays;
+
+public abstract class BaseRepeatedValueVector extends BaseValueVector implements RepeatedValueVector {
+
+  public final static ValueVector DEFAULT_DATA_VECTOR = ZeroVector.INSTANCE;
+  public final static String OFFSETS_VECTOR_NAME = "$offsets$";
+  public final static String DATA_VECTOR_NAME = "$data$";
+
+  public final static MaterializedField OFFSETS_FIELD =
+    MaterializedField.create(OFFSETS_VECTOR_NAME, new MajorType(MinorType.UINT4, DataMode.REQUIRED));
+
+  protected final UInt4Vector offsets;
+  protected ValueVector vector;
+
+  protected BaseRepeatedValueVector(MaterializedField field, BufferAllocator allocator) {
+    this(field, allocator, DEFAULT_DATA_VECTOR);
+  }
+
+  protected BaseRepeatedValueVector(MaterializedField field, BufferAllocator allocator, ValueVector vector) {
+    super(field, allocator);
+    this.offsets = new UInt4Vector(OFFSETS_FIELD, allocator);
+    this.vector = Preconditions.checkNotNull(vector, "data vector cannot be null");
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    /* Tracks whether every memory allocation succeeded. A composite vector
+     * has to allocate multiple buffers for multiple vectors, and if any one
+     * of those allocations fails, all the memory allocated so far must be
+     * cleared.
+     */
+    boolean success = false;
+    try {
+      if (!offsets.allocateNewSafe()) {
+        return false;
+      }
+      success = vector.allocateNewSafe();
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    offsets.zeroVector();
+    return success;
+  }
+
+
+  @Override
+  public UInt4Vector getOffsetVector() {
+    return offsets;
+  }
+
+  @Override
+  public ValueVector getDataVector() {
+    return vector;
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    offsets.setInitialCapacity(numRecords + 1);
+    vector.setInitialCapacity(numRecords * RepeatedValueVector.DEFAULT_REPEAT_PER_RECORD);
+  }
+
+  @Override
+  public int getValueCapacity() {
+    final int offsetValueCapacity = Math.max(offsets.getValueCapacity() - 1, 0);
+    if (vector == DEFAULT_DATA_VECTOR) {
+      return offsetValueCapacity;
+    }
+    return Math.min(vector.getValueCapacity(), offsetValueCapacity);
+  }
+
+//  @Override
+//  protected UserBitShared.SerializedField.Builder getMetadataBuilder() {
+//    return super.getMetadataBuilder()
+//        .addChild(offsets.getMetadata())
+//        .addChild(vector.getMetadata());
+//  }
+
+  @Override
+  public int getBufferSize() {
+    if (getAccessor().getValueCount() == 0) {
+      return 0;
+    }
+    return offsets.getBufferSize() + vector.getBufferSize();
+  }
+
+  @Override
+  public int getBufferSizeFor(int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    return offsets.getBufferSizeFor(valueCount + 1) + vector.getBufferSizeFor(valueCount);
+  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    return Collections.singleton(getDataVector()).iterator();
+  }
+
+  @Override
+  public void clear() {
+    offsets.clear();
+    vector.clear();
+    super.clear();
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final ArrowBuf[] buffers = ObjectArrays.concat(offsets.getBuffers(false), vector.getBuffers(false), ArrowBuf.class);
+    if (clear) {
+      for (ArrowBuf buffer : buffers) {
+        buffer.retain();
+      }
+      clear();
+    }
+    return buffers;
+  }
+
+//  @Override
+//  public void load(UserBitShared.SerializedField metadata, DrillBuf buffer) {
+//    final UserBitShared.SerializedField offsetMetadata = metadata.getChild(0);
+//    offsets.load(offsetMetadata, buffer);
+//
+//    final UserBitShared.SerializedField vectorMetadata = metadata.getChild(1);
+//    if (getDataVector() == DEFAULT_DATA_VECTOR) {
+//      addOrGetVector(VectorDescriptor.create(vectorMetadata.getMajorType()));
+//    }
+//
+//    final int offsetLength = offsetMetadata.getBufferLength();
+//    final int vectorLength = vectorMetadata.getBufferLength();
+//    vector.load(vectorMetadata, buffer.slice(offsetLength, vectorLength));
+//  }
+
+  /**
+   * Returns 1 if the inner vector has been explicitly set via {@link #addOrGetVector}, else 0.
+   *
+   * @see ContainerVectorLike#size
+   */
+  @Override
+  public int size() {
+    return vector == DEFAULT_DATA_VECTOR ? 0 : 1;
+  }
+
+  @Override
+  public <T extends ValueVector> AddOrGetResult<T> addOrGetVector(VectorDescriptor descriptor) {
+    boolean created = false;
+    if (vector == DEFAULT_DATA_VECTOR && descriptor.getType().getMinorType() != MinorType.LATE) {
+      final MaterializedField field = descriptor.withName(DATA_VECTOR_NAME).getField();
+      vector = BasicTypeHelper.getNewVector(field, allocator);
+      // returned vector must have the same field
+      assert field.equals(vector.getField());
+      getField().addChild(field);
+      created = true;
+    }
+
+    final MajorType actual = vector.getField().getType();
+    if (!actual.equals(descriptor.getType())) {
+      final String msg = String.format("Inner vector type mismatch. Requested type: [%s], actual type: [%s]",
+          descriptor.getType(), actual);
+      throw new SchemaChangeRuntimeException(msg);
+    }
+
+    return new AddOrGetResult<>((T)vector, created);
+  }
+
+  protected void replaceDataVector(ValueVector v) {
+    vector.clear();
+    vector = v;
+  }
+
+  public abstract class BaseRepeatedAccessor extends BaseValueVector.BaseAccessor implements RepeatedAccessor {
+
+    @Override
+    public int getValueCount() {
+      return Math.max(offsets.getAccessor().getValueCount() - 1, 0);
+    }
+
+    @Override
+    public int getInnerValueCount() {
+      return vector.getAccessor().getValueCount();
+    }
+
+    @Override
+    public int getInnerValueCountAt(int index) {
+      return offsets.getAccessor().get(index+1) - offsets.getAccessor().get(index);
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return false;
+    }
+
+    @Override
+    public boolean isEmpty(int index) {
+      return false;
+    }
+  }
+
+  public abstract class BaseRepeatedMutator extends BaseValueVector.BaseMutator implements RepeatedMutator {
+
+    @Override
+    public void startNewValue(int index) {
+      while (offsets.getValueCapacity() <= index) {
+        offsets.reAlloc();
+      }
+      offsets.getMutator().setSafe(index+1, offsets.getAccessor().get(index));
+      setValueCount(index+1);
+    }
+
+    @Override
+    public void setValueCount(int valueCount) {
+      // TODO: populate offset end points
+      offsets.getMutator().setValueCount(valueCount == 0 ? 0 : valueCount+1);
+      final int childValueCount = valueCount == 0 ? 0 : offsets.getAccessor().get(valueCount);
+      vector.getMutator().setValueCount(childValueCount);
+    }
+  }
+
+}
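
Editor's note on the offsets layout used by BaseRepeatedValueVector above: the offsets vector holds one more entry than there are values, and value i covers the data range from offsets[i] (inclusive) to offsets[i + 1] (exclusive). The sketch below illustrates this with plain int arrays; the class OffsetsSketch and its method names are assumptions for illustration and are not part of the patch.

```java
import java.util.Arrays;

// Hypothetical stand-in for the offsets/data pair; not Arrow's API.
final class OffsetsSketch {
  // Mirrors getValueCount(): one fewer value than offset entries, never negative.
  static int valueCount(int[] offsets) {
    return Math.max(offsets.length - 1, 0);
  }

  // Mirrors getInnerValueCountAt(index): the difference of adjacent offsets.
  static int innerValueCountAt(int[] offsets, int index) {
    return offsets[index + 1] - offsets[index];
  }

  // Copies out the inner values that belong to the value at the given index.
  static int[] valueAt(int[] offsets, int[] data, int index) {
    return Arrays.copyOfRange(data, offsets[index], offsets[index + 1]);
  }

  public static void main(String[] args) {
    int[] offsets = {0, 2, 2, 5};        // three values: [10, 11], [], [12, 13, 14]
    int[] data = {10, 11, 12, 13, 14};
    for (int i = 0; i < valueCount(offsets); i++) {
      System.out.println(i + " -> " + Arrays.toString(valueAt(offsets, data, i)));
    }
  }
}
```

An empty value is represented by two equal adjacent offsets, which is why setInitialCapacity and getBufferSizeFor in the diff size the offsets vector for one entry more than the number of records.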

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/ContainerVectorLike.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/ContainerVectorLike.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/ContainerVectorLike.java
new file mode 100644
index 0000000..e50b0d0
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/ContainerVectorLike.java
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import org.apache.arrow.vector.AddOrGetResult;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+
+/**
+ * A mix-in used for introducing container vector-like behaviour.
+ */
+public interface ContainerVectorLike {
+
+  /**
+   * Creates and adds a child vector if none with the same name exists, else returns the vector instance.
+   *
+   * @param  descriptor vector descriptor
+   * @return  result of operation wrapping vector corresponding to the given descriptor and whether it's newly created
+   * @throws org.apache.arrow.vector.util.SchemaChangeRuntimeException
+   *    if schema change is not permissible between the given and existing data vector types.
+   */
+  <T extends ValueVector> AddOrGetResult<T> addOrGetVector(VectorDescriptor descriptor);
+
+  /**
+   * Returns the number of child vectors in this container vector-like instance.
+   */
+  int size();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/EmptyValuePopulator.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/EmptyValuePopulator.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/EmptyValuePopulator.java
new file mode 100644
index 0000000..df69975
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/EmptyValuePopulator.java
@@ -0,0 +1,54 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import org.apache.arrow.vector.UInt4Vector;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * A helper class that is used to track and populate empty values in repeated value vectors.
+ */
+public class EmptyValuePopulator {
+  private final UInt4Vector offsets;
+
+  public EmptyValuePopulator(UInt4Vector offsets) {
+    this.offsets = Preconditions.checkNotNull(offsets, "offsets cannot be null");
+  }
+
+  /**
+   * Marks all values since the last set as empty. The last set value is obtained from underlying offsets vector.
+   *
+   * @param lastIndex  the last index (inclusive) in the offsets vector until which empty population takes place
+   * @throws java.lang.IndexOutOfBoundsException  if lastIndex is negative or greater than offsets capacity.
+   */
+  public void populate(int lastIndex) {
+    if (lastIndex < 0) {
+      throw new IndexOutOfBoundsException("index cannot be negative");
+    }
+    final UInt4Vector.Accessor accessor = offsets.getAccessor();
+    final UInt4Vector.Mutator mutator = offsets.getMutator();
+    final int lastSet = Math.max(accessor.getValueCount() - 1, 0);
+    final int previousEnd = accessor.get(lastSet);
+    for (int i = lastSet; i < lastIndex; i++) {
+      mutator.setSafe(i + 1, previousEnd);
+    }
+    mutator.setValueCount(lastIndex+1);
+  }
+
+}
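
Editor's note on EmptyValuePopulator.populate above: it copies the end offset of the last written value forward, so that every value up to lastIndex becomes a zero-length entry. A minimal sketch with a plain int array follows. The class EmptyPopulateSketch and its valueCount field are assumptions for illustration, and unlike setSafe in the patch, the fixed-size array here does not grow.

```java
import java.util.Arrays;

// Hypothetical stand-in for the UInt4Vector of offsets; not Arrow's API.
final class EmptyPopulateSketch {
  final int[] offsets;
  int valueCount; // number of offset entries currently in use

  EmptyPopulateSketch(int[] offsets, int valueCount) {
    this.offsets = offsets;
    this.valueCount = valueCount;
  }

  void populate(int lastIndex) {
    if (lastIndex < 0) {
      throw new IndexOutOfBoundsException("index cannot be negative");
    }
    final int lastSet = Math.max(valueCount - 1, 0);
    final int previousEnd = offsets[lastSet];
    // Repeating the same end offset makes each skipped value span zero elements.
    for (int i = lastSet; i < lastIndex; i++) {
      offsets[i + 1] = previousEnd;
    }
    valueCount = lastIndex + 1;
  }

  public static void main(String[] args) {
    // One value of length 3 has been written so far: offsets[0..1] = {0, 3}.
    EmptyPopulateSketch sketch = new EmptyPopulateSketch(new int[] {0, 3, 0, 0, 0, 0}, 2);
    sketch.populate(4);
    System.out.println(Arrays.toString(sketch.offsets) + " valueCount=" + sketch.valueCount);
  }
}
```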

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/ListVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/ListVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/ListVector.java
new file mode 100644
index 0000000..8387c9e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/ListVector.java
@@ -0,0 +1,321 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.List;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.AddOrGetResult;
+import org.apache.arrow.vector.UInt1Vector;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+import org.apache.arrow.vector.ZeroVector;
+import org.apache.arrow.vector.complex.impl.ComplexCopier;
+import org.apache.arrow.vector.complex.impl.UnionListReader;
+import org.apache.arrow.vector.complex.impl.UnionListWriter;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.JsonStringArrayList;
+import org.apache.arrow.vector.util.TransferPair;
+
+import com.google.common.collect.ObjectArrays;
+
+public class ListVector extends BaseRepeatedValueVector {
+
+  private UInt4Vector offsets;
+  private final UInt1Vector bits;
+  private Mutator mutator = new Mutator();
+  private Accessor accessor = new Accessor();
+  private UnionListWriter writer;
+  private UnionListReader reader;
+  private CallBack callBack;
+
+  public ListVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    super(field, allocator);
+    this.bits = new UInt1Vector(MaterializedField.create("$bits$", new MajorType(MinorType.UINT1, DataMode.REQUIRED)), allocator);
+    offsets = getOffsetVector();
+    this.field.addChild(getDataVector().getField());
+    this.writer = new UnionListWriter(this);
+    this.reader = new UnionListReader(this);
+    this.callBack = callBack;
+  }
+
+  public UnionListWriter getWriter() {
+    return writer;
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
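+    // Use this class's allocateNewSafe() so the bits vector is allocated too, and report failure.
+    if (!allocateNewSafe()) {
+      throw new OutOfMemoryException("Failure while allocating memory for ListVector");
+    }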
+  }
+
+  public void transferTo(ListVector target) {
+    offsets.makeTransferPair(target.offsets).transfer();
+    bits.makeTransferPair(target.bits).transfer();
+    if (target.getDataVector() instanceof ZeroVector) {
+      target.addOrGetVector(new VectorDescriptor(vector.getField().getType()));
+    }
+    getDataVector().makeTransferPair(target.getDataVector()).transfer();
+  }
+
+  public void copyFromSafe(int inIndex, int outIndex, ListVector from) {
+    copyFrom(inIndex, outIndex, from);
+  }
+
+  public void copyFrom(int inIndex, int outIndex, ListVector from) {
+    FieldReader in = from.getReader();
+    in.setPosition(inIndex);
+    FieldWriter out = getWriter();
+    out.setPosition(outIndex);
+    ComplexCopier.copy(in, out);
+  }
+
+  @Override
+  public ValueVector getDataVector() {
+    return vector;
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new TransferImpl(field.withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector target) {
+    return new TransferImpl((ListVector) target);
+  }
+
+  private class TransferImpl implements TransferPair {
+
+    ListVector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator) {
+      to = new ListVector(field, allocator, null);
+      to.addOrGetVector(new VectorDescriptor(vector.getField().getType()));
+    }
+
+    public TransferImpl(ListVector to) {
+      this.to = to;
+      to.addOrGetVector(new VectorDescriptor(vector.getField().getType()));
+    }
+
+    @Override
+    public void transfer() {
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      to.allocateNew();
+      for (int i = 0; i < length; i++) {
+        copyValueSafe(startIndex + i, i);
+      }
+    }
+
+    @Override
+    public ValueVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) {
+      this.to.copyFrom(from, to, ListVector.this);
+    }
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  @Override
+  public FieldReader getReader() {
+    return reader;
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    /* Tracks whether every memory allocation succeeded. A composite vector
+     * has to allocate multiple buffers for multiple vectors, and if any one
+     * of those allocations fails, all the memory allocated so far must be
+     * cleared.
+     */
+    boolean success = false;
+    try {
+      if (!offsets.allocateNewSafe()) {
+        return false;
+      }
+      success = vector.allocateNewSafe();
+      success = success && bits.allocateNewSafe();
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    if (success) {
+      offsets.zeroVector();
+      bits.zeroVector();
+    }
+    return success;
+  }
+
+//  @Override
+//  protected UserBitShared.SerializedField.Builder getMetadataBuilder() {
+//    return getField().getAsBuilder()
+//            .setValueCount(getAccessor().getValueCount())
+//            .setBufferLength(getBufferSize())
+//            .addChild(offsets.getMetadata())
+//            .addChild(bits.getMetadata())
+//            .addChild(vector.getMetadata());
+//  }
+  public <T extends ValueVector> AddOrGetResult<T> addOrGetVector(VectorDescriptor descriptor) {
+    AddOrGetResult<T> result = super.addOrGetVector(descriptor);
+    reader = new UnionListReader(this);
+    return result;
+  }
+
+  @Override
+  public int getBufferSize() {
+    if (getAccessor().getValueCount() == 0) {
+      return 0;
+    }
+    return offsets.getBufferSize() + bits.getBufferSize() + vector.getBufferSize();
+  }
+
+  @Override
+  public void clear() {
+    offsets.clear();
+    vector.clear();
+    bits.clear();
+    lastSet = 0;
+    super.clear();
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final ArrowBuf[] buffers = ObjectArrays.concat(offsets.getBuffers(false), ObjectArrays.concat(bits.getBuffers(false),
+            vector.getBuffers(false), ArrowBuf.class), ArrowBuf.class);
+    if (clear) {
+      for (ArrowBuf buffer : buffers) {
+        buffer.retain();
+      }
+      clear();
+    }
+    return buffers;
+  }
+
+//  @Override
+//  public void load(UserBitShared.SerializedField metadata, DrillBuf buffer) {
+//    final UserBitShared.SerializedField offsetMetadata = metadata.getChild(0);
+//    offsets.load(offsetMetadata, buffer);
+//
+//    final int offsetLength = offsetMetadata.getBufferLength();
+//    final UserBitShared.SerializedField bitMetadata = metadata.getChild(1);
+//    final int bitLength = bitMetadata.getBufferLength();
+//    bits.load(bitMetadata, buffer.slice(offsetLength, bitLength));
+//
+//    final UserBitShared.SerializedField vectorMetadata = metadata.getChild(2);
+//    if (getDataVector() == DEFAULT_DATA_VECTOR) {
+//      addOrGetVector(VectorDescriptor.create(vectorMetadata.getMajorType()));
+//    }
+//
+//    final int vectorLength = vectorMetadata.getBufferLength();
+//    vector.load(vectorMetadata, buffer.slice(offsetLength + bitLength, vectorLength));
+//  }
+
+  public UnionVector promoteToUnion() {
+    MaterializedField newField = MaterializedField.create(getField().getPath(), new MajorType(MinorType.UNION, DataMode.OPTIONAL));
+    UnionVector vector = new UnionVector(newField, allocator, null);
+    replaceDataVector(vector);
+    reader = new UnionListReader(this);
+    return vector;
+  }
+
+  private int lastSet;
+
+  public class Accessor extends BaseRepeatedAccessor {
+
+    @Override
+    public Object getObject(int index) {
+      if (isNull(index)) {
+        return null;
+      }
+      final List<Object> vals = new JsonStringArrayList<>();
+      final UInt4Vector.Accessor offsetsAccessor = offsets.getAccessor();
+      final int start = offsetsAccessor.get(index);
+      final int end = offsetsAccessor.get(index + 1);
+      final ValueVector.Accessor valuesAccessor = getDataVector().getAccessor();
+      for(int i = start; i < end; i++) {
+        vals.add(valuesAccessor.getObject(i));
+      }
+      return vals;
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return bits.getAccessor().get(index) == 0;
+    }
+  }
+
+  public class Mutator extends BaseRepeatedMutator {
+    public void setNotNull(int index) {
+      bits.getMutator().setSafe(index, 1);
+      lastSet = index + 1;
+    }
+
+    @Override
+    public void startNewValue(int index) {
+      for (int i = lastSet; i <= index; i++) {
+        offsets.getMutator().setSafe(i + 1, offsets.getAccessor().get(i));
+      }
+      setNotNull(index);
+      lastSet = index + 1;
+    }
+
+    @Override
+    public void setValueCount(int valueCount) {
+      // TODO: populate offset end points
+      if (valueCount == 0) {
+        offsets.getMutator().setValueCount(0);
+      } else {
+        for (int i = lastSet; i < valueCount; i++) {
+          offsets.getMutator().setSafe(i + 1, offsets.getAccessor().get(i));
+        }
+        offsets.getMutator().setValueCount(valueCount + 1);
+      }
+      final int childValueCount = valueCount == 0 ? 0 : offsets.getAccessor().get(valueCount);
+      vector.getMutator().setValueCount(childValueCount);
+      bits.getMutator().setValueCount(valueCount);
+    }
+  }
+}
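
Editor's note on ListVector above: it adds a validity vector (bits) to the offsets/data pair and uses lastSet to back-fill the offsets of positions that were skipped, so skipped positions read as null with zero length. The sketch below models this with plain arrays. The class NullableListSketch and the append helper are assumptions for illustration; in the patch, the inner values are written through a UnionListWriter rather than a method like append.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ListVector's offsets, bits and data vectors; not Arrow's API.
final class NullableListSketch {
  private final int[] offsets;  // capacity + 1 entries
  private final byte[] bits;    // 1 = value is set, 0 = null
  private final List<Integer> data = new ArrayList<>();
  private int lastSet;          // first index whose end offset has not been written yet

  NullableListSketch(int capacity) {
    offsets = new int[capacity + 1];
    bits = new byte[capacity];
  }

  void startNewValue(int index) {
    // Back-fill skipped positions so each one starts and ends at the previous end.
    for (int i = lastSet; i <= index; i++) {
      offsets[i + 1] = offsets[i];
    }
    bits[index] = 1;
    lastSet = index + 1;
  }

  // Hypothetical helper: appends one inner value to the list at the given index.
  void append(int index, int value) {
    data.add(value);
    offsets[index + 1]++;
  }

  void setValueCount(int valueCount) {
    // Trailing positions that were never started also get zero-length ranges.
    for (int i = lastSet; i < valueCount; i++) {
      offsets[i + 1] = offsets[i];
    }
  }

  List<Integer> get(int index) {
    if (bits[index] == 0) {
      return null;
    }
    return new ArrayList<>(data.subList(offsets[index], offsets[index + 1]));
  }

  int[] offsetsCopy() {
    return offsets.clone();
  }

  public static void main(String[] args) {
    NullableListSketch list = new NullableListSketch(5);
    list.startNewValue(0);
    list.append(0, 1);
    list.append(0, 2);
    list.startNewValue(3);   // positions 1 and 2 are skipped and stay null
    list.append(3, 9);
    list.setValueCount(5);
    System.out.println(Arrays.toString(list.offsetsCopy()));
    System.out.println(list.get(0) + " " + list.get(1) + " " + list.get(3));
  }
}
```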

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
new file mode 100644
index 0000000..1bbce73
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
@@ -0,0 +1,374 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.Map;
+
+import javax.annotation.Nullable;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.BaseValueVector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.complex.RepeatedMapVector.MapSingleCopier;
+import org.apache.arrow.vector.complex.impl.SingleMapReaderImpl;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.ComplexHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.JsonStringHashMap;
+import org.apache.arrow.vector.util.TransferPair;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Ordering;
+import com.google.common.primitives.Ints;
+
+public class MapVector extends AbstractMapVector {
+  //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(MapVector.class);
+
+  public final static MajorType TYPE = new MajorType(MinorType.MAP, DataMode.OPTIONAL);
+
+  private final SingleMapReaderImpl reader = new SingleMapReaderImpl(MapVector.this);
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int valueCount;
+
+  public MapVector(String path, BufferAllocator allocator, CallBack callBack) {
+    this(MaterializedField.create(path, TYPE), allocator, callBack);
+  }
+
+  public MapVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    super(field, allocator, callBack);
+  }
+
+  @Override
+  public FieldReader getReader() {
+    //return new SingleMapReaderImpl(MapVector.this);
+    return reader;
+  }
+
+  transient private MapTransferPair ephPair;
+  transient private MapSingleCopier ephPair2;
+
+  public void copyFromSafe(int fromIndex, int thisIndex, MapVector from) {
+    if (ephPair == null || ephPair.from != from) {
+      ephPair = (MapTransferPair) from.makeTransferPair(this);
+    }
+    ephPair.copyValueSafe(fromIndex, thisIndex);
+  }
+
+  public void copyFromSafe(int fromSubIndex, int thisIndex, RepeatedMapVector from) {
+    if(ephPair2 == null || ephPair2.from != from) {
+      ephPair2 = from.makeSingularCopier(this);
+    }
+    ephPair2.copySafe(fromSubIndex, thisIndex);
+  }
+
+  @Override
+  protected boolean supportsDirectRead() {
+    return true;
+  }
+
+  public Iterator<String> fieldNameIterator() {
+    return getChildFieldNames().iterator();
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    for (final ValueVector v : (Iterable<ValueVector>) this) {
+      v.setInitialCapacity(numRecords);
+    }
+  }
+
+  @Override
+  public int getBufferSize() {
+    if (valueCount == 0 || size() == 0) {
+      return 0;
+    }
+    long buffer = 0;
+    for (final ValueVector v : (Iterable<ValueVector>)this) {
+      buffer += v.getBufferSize();
+    }
+
+    return (int) buffer;
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    long bufferSize = 0;
+    for (final ValueVector v : (Iterable<ValueVector>) this) {
+      bufferSize += v.getBufferSizeFor(valueCount);
+    }
+
+    return (int) bufferSize;
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    int expectedSize = getBufferSize();
+    int actualSize   = super.getBufferSize();
+
+    Preconditions.checkArgument(expectedSize == actualSize);
+    return super.getBuffers(clear);
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new MapTransferPair(this, getField().getPath(), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new MapTransferPair(this, (MapVector) to);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new MapTransferPair(this, ref, allocator);
+  }
+
+  protected static class MapTransferPair implements TransferPair{
+    private final TransferPair[] pairs;
+    private final MapVector from;
+    private final MapVector to;
+
+    public MapTransferPair(MapVector from, String path, BufferAllocator allocator) {
+      this(from, new MapVector(MaterializedField.create(path, TYPE), allocator, from.callBack), false);
+    }
+
+    public MapTransferPair(MapVector from, MapVector to) {
+      this(from, to, true);
+    }
+
+    protected MapTransferPair(MapVector from, MapVector to, boolean allocate) {
+      this.from = from;
+      this.to = to;
+      this.pairs = new TransferPair[from.size()];
+      this.to.ephPair = null;
+      this.to.ephPair2 = null;
+
+      int i = 0;
+      ValueVector vector;
+      for (String child:from.getChildFieldNames()) {
+        int preSize = to.size();
+        vector = from.getChild(child);
+        if (vector == null) {
+          continue;
+        }
+        //DRILL-1872: we add the child fields for the vector, looking up the field by name. For a map vector,
+        // the child fields may be nested fields of the top level child. For example if the structure
+        // of a child field is oa.oab.oabc then we add oa, then add oab to oa then oabc to oab.
+        // But the children member of a Materialized field is a HashSet. If the fields are added in the
+        // children HashSet, and the hashCode of the Materialized field includes the hash code of the
+        // children, the hashCode value of oa changes *after* the field has been added to the HashSet.
+        // (This is similar to what happens in ScanBatch where the children cannot be added till they are
+        // read). To take care of this, we ensure that the hashCode of the MaterializedField does not
+        // include the hashCode of the children but is based only on MaterializedField$key.
+        final ValueVector newVector = to.addOrGet(child, vector.getField().getType(), vector.getClass());
+        if (allocate && to.size() != preSize) {
+          newVector.allocateNew();
+        }
+        pairs[i++] = vector.makeTransferPair(newVector);
+      }
+    }
+
+    @Override
+    public void transfer() {
+      for (final TransferPair p : pairs) {
+        p.transfer();
+      }
+      to.valueCount = from.valueCount;
+      from.clear();
+    }
+
+    @Override
+    public ValueVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) {
+      for (TransferPair p : pairs) {
+        p.copyValueSafe(from, to);
+      }
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      for (TransferPair p : pairs) {
+        p.splitAndTransfer(startIndex, length);
+      }
+      to.getMutator().setValueCount(length);
+    }
+  }
+
+  @Override
+  public int getValueCapacity() {
+    if (size() == 0) {
+      return 0;
+    }
+
+    final Ordering<ValueVector> natural = new Ordering<ValueVector>() {
+      @Override
+      public int compare(@Nullable ValueVector left, @Nullable ValueVector right) {
+        return Ints.compare(
+            Preconditions.checkNotNull(left).getValueCapacity(),
+            Preconditions.checkNotNull(right).getValueCapacity()
+        );
+      }
+    };
+
+    return natural.min(getChildren()).getValueCapacity();
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+//  @Override
+//  public void load(SerializedField metadata, DrillBuf buf) {
+//    final List<SerializedField> fields = metadata.getChildList();
+//    valueCount = metadata.getValueCount();
+//
+//    int bufOffset = 0;
+//    for (final SerializedField child : fields) {
+//      final MaterializedField fieldDef = SerializedFieldHelper.create(child);
+//
+//      ValueVector vector = getChild(fieldDef.getLastName());
+//      if (vector == null) {
+//         if we arrive here, we didn't have a matching vector.
+//        vector = BasicTypeHelper.getNewVector(fieldDef, allocator);
+//        putChild(fieldDef.getLastName(), vector);
+//      }
+//      if (child.getValueCount() == 0) {
+//        vector.clear();
+//      } else {
+//        vector.load(child, buf.slice(bufOffset, child.getBufferLength()));
+//      }
+//      bufOffset += child.getBufferLength();
+//    }
+//
+//    assert bufOffset == buf.capacity();
+//  }
+//
+//  @Override
+//  public SerializedField getMetadata() {
+//    SerializedField.Builder b = getField() //
+//        .getAsBuilder() //
+//        .setBufferLength(getBufferSize()) //
+//        .setValueCount(valueCount);
+//
+//
+//    for(ValueVector v : getChildren()) {
+//      b.addChild(v.getMetadata());
+//    }
+//    return b.build();
+//  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  public class Accessor extends BaseValueVector.BaseAccessor {
+
+    @Override
+    public Object getObject(int index) {
+      Map<String, Object> vv = new JsonStringHashMap<>();
+      for (String child:getChildFieldNames()) {
+        ValueVector v = getChild(child);
+        // TODO(DRILL-4001):  Resolve this hack:
+        // The index/value count check in the following if statement is a hack
+        // to work around the current fact that RecordBatchLoader.load and
+        // MapVector.load leave child vectors with a length of zero (as opposed
+        // to matching the lengths of siblings and the parent map vector)
+        // because they don't remove (or set the lengths of) vectors from
+        // previous batches that aren't in the current batch.
+        if (v != null && index < v.getAccessor().getValueCount()) {
+          Object value = v.getAccessor().getObject(index);
+          if (value != null) {
+            vv.put(child, value);
+          }
+        }
+      }
+      return vv;
+    }
+
+    public void get(int index, ComplexHolder holder) {
+      reader.setPosition(index);
+      holder.reader = reader;
+    }
+
+    @Override
+    public int getValueCount() {
+      return valueCount;
+    }
+  }
+
+  public ValueVector getVectorById(int id) {
+    return getChildByOrdinal(id);
+  }
+
+  public class Mutator extends BaseValueVector.BaseMutator {
+
+    @Override
+    public void setValueCount(int valueCount) {
+      for (final ValueVector v : getChildren()) {
+        v.getMutator().setValueCount(valueCount);
+      }
+      MapVector.this.valueCount = valueCount;
+    }
+
+    @Override
+    public void reset() { }
+
+    @Override
+    public void generateTestData(int values) { }
+  }
+
+  @Override
+  public void clear() {
+    for (final ValueVector v : getChildren()) {
+      v.clear();
+    }
+    valueCount = 0;
+  }
+
+  @Override
+  public void close() {
+    final Collection<ValueVector> vectors = getChildren();
+    for (final ValueVector v : vectors) {
+      v.close();
+    }
+    vectors.clear();
+    valueCount = 0;
+
+    super.close();
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/Positionable.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/Positionable.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/Positionable.java
new file mode 100644
index 0000000..9345118
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/Positionable.java
@@ -0,0 +1,22 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+public interface Positionable {
+  public void setPosition(int index);
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedFixedWidthVectorLike.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedFixedWidthVectorLike.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedFixedWidthVectorLike.java
new file mode 100644
index 0000000..23850bc
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedFixedWidthVectorLike.java
@@ -0,0 +1,40 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+/**
+ * A {@link org.apache.arrow.vector.ValueVector} mix-in that can be used in conjunction with
+ * {@link RepeatedValueVector} subtypes.
+ */
+public interface RepeatedFixedWidthVectorLike {
+  /**
+   * Allocate a new memory space for this vector.  Must be called prior to using the ValueVector.
+   *
+   * @param valueCount   Number of separate repeating groupings.
+   * @param innerValueCount   Number of supported values in the vector.
+   */
+  void allocateNew(int valueCount, int innerValueCount);
+
+  /**
+   * Load the records in the provided buffer based on the given number of values.
+   * @param valueCount   Number of separate repeating groupings.
+   * @param innerValueCount Number of atomic values the buffer contains.
+   * @param buf Incoming buffer.
+   * @return The number of bytes of the buffer that were consumed.
+   */
+}


[14/17] arrow git commit: ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a cmake-based build system. The codebase generally follows Google C++ style guide, but more cleaning to be more conforming is

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/list-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/list-test.cc b/cpp/src/arrow/types/list-test.cc
new file mode 100644
index 0000000..47673ff
--- /dev/null
+++ b/cpp/src/arrow/types/list-test.cc
@@ -0,0 +1,166 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <cstdlib>
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/test-util.h"
+#include "arrow/type.h"
+#include "arrow/types/construct.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/list.h"
+#include "arrow/types/string.h"
+#include "arrow/types/test-common.h"
+#include "arrow/util/status.h"
+
+using std::string;
+using std::unique_ptr;
+using std::vector;
+
+namespace arrow {
+
+class ArrayBuilder;
+
+TEST(TypesTest, TestListType) {
+  std::shared_ptr<DataType> vt = std::make_shared<UInt8Type>();
+
+  ListType list_type(vt);
+  ListType list_type_nn(vt, false);
+
+  ASSERT_EQ(list_type.type, TypeEnum::LIST);
+  ASSERT_TRUE(list_type.nullable);
+  ASSERT_FALSE(list_type_nn.nullable);
+
+  ASSERT_EQ(list_type.name(), string("list"));
+  ASSERT_EQ(list_type.ToString(), string("list<uint8>"));
+
+  ASSERT_EQ(list_type.value_type->type, vt->type);
+  ASSERT_EQ(list_type.value_type->type, vt->type);
+
+  std::shared_ptr<DataType> st = std::make_shared<StringType>();
+  std::shared_ptr<DataType> lt = std::make_shared<ListType>(st);
+  ASSERT_EQ(lt->ToString(), string("list<string>"));
+
+  ListType lt2(lt);
+  ASSERT_EQ(lt2.ToString(), string("list<list<string>>"));
+}
+
+// ----------------------------------------------------------------------
+// List tests
+
+class TestListBuilder : public TestBuilder {
+ public:
+  void SetUp() {
+    TestBuilder::SetUp();
+
+    value_type_ = TypePtr(new Int32Type());
+    type_ = TypePtr(new ListType(value_type_));
+
+    ArrayBuilder* tmp;
+    ASSERT_OK(make_builder(type_, &tmp));
+    builder_.reset(static_cast<ListBuilder*>(tmp));
+  }
+
+  void Done() {
+    Array* out;
+    ASSERT_OK(builder_->ToArray(&out));
+    result_.reset(static_cast<ListArray*>(out));
+  }
+
+ protected:
+  TypePtr value_type_;
+  TypePtr type_;
+
+  unique_ptr<ListBuilder> builder_;
+  unique_ptr<ListArray> result_;
+};
+
+
+TEST_F(TestListBuilder, TestResize) {
+}
+
+TEST_F(TestListBuilder, TestAppendNull) {
+  ASSERT_OK(builder_->AppendNull());
+  ASSERT_OK(builder_->AppendNull());
+
+  Done();
+
+  ASSERT_TRUE(result_->IsNull(0));
+  ASSERT_TRUE(result_->IsNull(1));
+
+  ASSERT_EQ(0, result_->offsets()[0]);
+  ASSERT_EQ(0, result_->offset(1));
+  ASSERT_EQ(0, result_->offset(2));
+
+  Int32Array* values = static_cast<Int32Array*>(result_->values().get());
+  ASSERT_EQ(0, values->length());
+}
+
+TEST_F(TestListBuilder, TestBasics) {
+  vector<int32_t> values = {0, 1, 2, 3, 4, 5, 6};
+  vector<int> lengths = {3, 0, 4};
+  vector<uint8_t> is_null = {0, 1, 0};
+
+  Int32Builder* vb = static_cast<Int32Builder*>(builder_->value_builder());
+
+  int pos = 0;
+  for (size_t i = 0; i < lengths.size(); ++i) {
+    ASSERT_OK(builder_->Append(is_null[i] > 0));
+    for (int j = 0; j < lengths[i]; ++j) {
+      ASSERT_OK(vb->Append(values[pos++]));
+    }
+  }
+
+  Done();
+
+  ASSERT_TRUE(result_->nullable());
+  ASSERT_TRUE(result_->values()->nullable());
+
+  ASSERT_EQ(3, result_->length());
+  vector<int32_t> ex_offsets = {0, 3, 3, 7};
+  for (size_t i = 0; i < ex_offsets.size(); ++i) {
+    ASSERT_EQ(ex_offsets[i], result_->offset(i));
+  }
+
+  for (int i = 0; i < result_->length(); ++i) {
+    ASSERT_EQ(static_cast<bool>(is_null[i]), result_->IsNull(i));
+  }
+
+  ASSERT_EQ(7, result_->values()->length());
+  Int32Array* varr = static_cast<Int32Array*>(result_->values().get());
+
+  for (size_t i = 0; i < values.size(); ++i) {
+    ASSERT_EQ(values[i], varr->Value(i));
+  }
+}
+
+TEST_F(TestListBuilder, TestBasicsNonNullable) {
+}
+
+
+TEST_F(TestListBuilder, TestZeroLength) {
+  // All buffers are null
+  Done();
+}
+
+
+} // namespace arrow
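
Note on the expected offsets in TestBasics above: the values {0, 3, 3, 7} are the running sum of the slot lengths {3, 0, 4}, with one trailing entry equal to the total number of child values. The following is a minimal standalone sketch of that arithmetic, not code from this commit. OffsetsFromLengths is a hypothetical helper name, and unlike ListBuilder::Transfer it always writes the trailing offset, even for an empty input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the Arrow codebase): computes the offsets
// buffer implied by a sequence of per-slot list lengths. A null slot is
// assumed to contribute zero child values, so it appears here as length 0.
std::vector<int32_t> OffsetsFromLengths(const std::vector<int32_t>& lengths) {
  std::vector<int32_t> offsets;
  offsets.reserve(lengths.size() + 1);

  int32_t pos = 0;
  for (int32_t len : lengths) {
    assert(len >= 0);
    offsets.push_back(pos);  // where this slot starts in the child array
    pos += len;
  }
  // Trailing entry: total number of child values. This sketch always emits
  // it; the builder in this commit only does so when the length is non-zero.
  offsets.push_back(pos);
  return offsets;
}
```

A zero-length slot and a null slot look the same in the offsets buffer (two equal consecutive entries); only the null bitmap tells them apart.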

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/list.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/list.cc b/cpp/src/arrow/types/list.cc
new file mode 100644
index 0000000..f0ff5bf
--- /dev/null
+++ b/cpp/src/arrow/types/list.cc
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/list.h"
+
+#include <sstream>
+#include <string>
+
+namespace arrow {
+
+std::string ListType::ToString() const {
+  std::stringstream s;
+  s << "list<" << value_type->ToString() << ">";
+  return s.str();
+}
+
+} // namespace arrow
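
Note on ListType::ToString above: because it delegates to value_type->ToString(), nested list types render recursively, which is what the "list<list<string>>" assertion in list-test.cc relies on. A minimal standalone sketch of the same pattern follows. SketchType, SketchString and SketchList are hypothetical stand-ins, not the arrow::DataType hierarchy, and plain string concatenation is used here in place of std::stringstream.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a logical type with a printable name.
struct SketchType {
  virtual ~SketchType() = default;
  virtual std::string ToString() const = 0;
};

// Leaf type: prints its own name.
struct SketchString : SketchType {
  std::string ToString() const override { return "string"; }
};

// Parametric type: wraps whatever its value type prints.
struct SketchList : SketchType {
  std::shared_ptr<SketchType> value_type;

  explicit SketchList(std::shared_ptr<SketchType> vt)
      : value_type(std::move(vt)) {
    assert(value_type != nullptr);
  }

  std::string ToString() const override {
    // Recursion happens through the virtual call on the child type.
    return "list<" + value_type->ToString() + ">";
  }
};
```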

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/list.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/list.h b/cpp/src/arrow/types/list.h
new file mode 100644
index 0000000..0f11162
--- /dev/null
+++ b/cpp/src/arrow/types/list.h
@@ -0,0 +1,206 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_LIST_H
+#define ARROW_TYPES_LIST_H
+
+#include <cstdint>
+#include <cstring>
+#include <memory>
+#include <string>
+
+#include "arrow/array.h"
+#include "arrow/builder.h"
+#include "arrow/type.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/primitive.h"
+#include "arrow/util/bit-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+struct ListType : public DataType {
+  // List can contain any other logical value type
+  TypePtr value_type;
+
+  explicit ListType(const TypePtr& value_type, bool nullable = true)
+      : DataType(TypeEnum::LIST, nullable),
+        value_type(value_type) {}
+
+  static char const *name() {
+    return "list";
+  }
+
+  virtual std::string ToString() const;
+};
+
+
+class ListArray : public Array {
+ public:
+  ListArray() : Array(), offset_buf_(nullptr), offsets_(nullptr) {}
+
+  ListArray(const TypePtr& type, int64_t length, std::shared_ptr<Buffer> offsets,
+      const ArrayPtr& values, std::shared_ptr<Buffer> nulls = nullptr) {
+    Init(type, length, offsets, values, nulls);
+  }
+
+  virtual ~ListArray() {}
+
+  void Init(const TypePtr& type, int64_t length, std::shared_ptr<Buffer> offsets,
+      const ArrayPtr& values, std::shared_ptr<Buffer> nulls = nullptr) {
+    offset_buf_ = offsets;
+    offsets_ = offsets == nullptr? nullptr :
+      reinterpret_cast<const int32_t*>(offset_buf_->data());
+
+    values_ = values;
+    Array::Init(type, length, nulls);
+  }
+
+  // Return a shared pointer in case the requestor desires to share ownership
+  // with this array.
+  const ArrayPtr& values() const {return values_;}
+
+  const int32_t* offsets() const { return offsets_;}
+
+  int32_t offset(int i) const { return offsets_[i];}
+
+  // Neither of these functions performs bounds checking
+  int32_t value_offset(int i) { return offsets_[i];}
+  int32_t value_length(int i) { return offsets_[i + 1] - offsets_[i];}
+
+ protected:
+  std::shared_ptr<Buffer> offset_buf_;
+  const int32_t* offsets_;
+  ArrayPtr values_;
+};
+
+// ----------------------------------------------------------------------
+// Array builder
+
+
+// Builder class for variable-length list array value types
+//
+// To use this class, you must append values to the child array builder and use
+// the Append function to delimit each distinct list value (once the values
+// have been appended to the child array)
+class ListBuilder : public Int32Builder {
+ public:
+  ListBuilder(const TypePtr& type, ArrayBuilder* value_builder)
+      : Int32Builder(type) {
+    value_builder_.reset(value_builder);
+  }
+
+  Status Init(int64_t elements) {
+    // One more than requested.
+    //
+    // XXX: This is slightly imprecise, because we might trigger null mask
+    // resizes that are unnecessary when creating arrays with power-of-two size
+    return Int32Builder::Init(elements + 1);
+  }
+
+  Status Resize(int64_t capacity) {
+    // Need space for the end offset
+    RETURN_NOT_OK(Int32Builder::Resize(capacity + 1));
+
+    // Slight hack, as the "real" capacity is one less
+    --capacity_;
+    return Status::OK();
+  }
+
+  // Vector append
+  //
+  // If passed, null_bytes is of equal length to values, and any nonzero byte
+  // will be considered as a null for that slot
+  Status Append(T* values, int64_t length, uint8_t* null_bytes = nullptr) {
+    if (length_ + length > capacity_) {
+      int64_t new_capacity = util::next_power2(length_ + length);
+      RETURN_NOT_OK(Resize(new_capacity));
+    }
+    memcpy(raw_buffer() + length_, values, length * elsize_);
+
+    if (nullable_ && null_bytes != nullptr) {
+      // Any nonzero byte in null_bytes marks the corresponding slot as null
+      for (int i = 0; i < length; ++i) {
+        util::set_bit(null_bits_, length_ + i, static_cast<bool>(null_bytes[i]));
+      }
+    }
+
+    length_ += length;
+    return Status::OK();
+  }
+
+  // Initialize an array type instance with the results of this builder
+  // Transfers ownership of all buffers
+  template <typename Container>
+  Status Transfer(Container* out) {
+    Array* child_values;
+    RETURN_NOT_OK(value_builder_->ToArray(&child_values));
+
+    // Add final offset if the length is non-zero
+    if (length_) {
+      raw_buffer()[length_] = child_values->length();
+    }
+
+    out->Init(type_, length_, values_, ArrayPtr(child_values), nulls_);
+    values_ = nulls_ = nullptr;
+    capacity_ = length_ = 0;
+    return Status::OK();
+  }
+
+  virtual Status ToArray(Array** out) {
+    ListArray* result = new ListArray();
+    RETURN_NOT_OK(Transfer(result));
+    *out = static_cast<Array*>(result);
+    return Status::OK();
+  }
+
+  // Start a new variable-length list slot
+  //
+  // This function should be called before beginning to append elements to the
+  // value builder
+  Status Append(bool is_null = false) {
+    if (length_ == capacity_) {
+      // Out of space: grow to the next power of two above the current capacity
+      RETURN_NOT_OK(Resize(util::next_power2(capacity_ + 1)));
+    }
+    if (nullable_) {
+      util::set_bit(null_bits_, length_, is_null);
+    }
+
+    raw_buffer()[length_++] = value_builder_->length();
+    return Status::OK();
+  }
+
+  // Status Append(int32_t* offsets, int length, uint8_t* null_bytes) {
+  //   return Int32Builder::Append(offsets, length, null_bytes);
+  // }
+
+  Status AppendNull() {
+    return Append(true);
+  }
+
+  ArrayBuilder* value_builder() const { return value_builder_.get();}
+
+ protected:
+  std::unique_ptr<ArrayBuilder> value_builder_;
+};
+
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_LIST_H
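
Note on reading a ListArray as defined above: slot i covers child values [offsets[i], offsets[i + 1]), which is what value_offset and value_length expose. The sketch below illustrates that access pattern on plain vectors. ListView and Slot are hypothetical names introduced only for illustration, and the child values are assumed to be int32. Unlike the accessors in list.h, Slot asserts that the index is in range.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the offsets/values pair held by a list array.
struct ListView {
  std::vector<int32_t> offsets;  // length + 1 entries, non-decreasing
  std::vector<int32_t> values;   // flattened child values

  int64_t length() const { return static_cast<int64_t>(offsets.size()) - 1; }

  // Same arithmetic as ListArray::value_offset / value_length.
  int32_t value_offset(int64_t i) const { return offsets[i]; }
  int32_t value_length(int64_t i) const { return offsets[i + 1] - offsets[i]; }

  // Copies slot i out of the flattened child values.
  std::vector<int32_t> Slot(int64_t i) const {
    assert(i >= 0 && i < length());
    auto first = values.begin() + value_offset(i);
    return std::vector<int32_t>(first, first + value_length(i));
  }
};
```

With offsets {0, 3, 3, 7} and child values 0 through 6, slot 1 is empty and slot 2 holds the last four child values.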

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/null.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/null.h b/cpp/src/arrow/types/null.h
new file mode 100644
index 0000000..c67f752
--- /dev/null
+++ b/cpp/src/arrow/types/null.h
@@ -0,0 +1,34 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_NULL_H
+#define ARROW_TYPES_NULL_H
+
+#include <string>
+#include <vector>
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct NullType : public PrimitiveType<NullType> {
+  PRIMITIVE_DECL(NullType, void, NA, 0, "null");
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_NULL_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/primitive-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/primitive-test.cc b/cpp/src/arrow/types/primitive-test.cc
new file mode 100644
index 0000000..1296860
--- /dev/null
+++ b/cpp/src/arrow/types/primitive-test.cc
@@ -0,0 +1,345 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+
+#include <cstdint>
+#include <cstdlib>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/builder.h"
+#include "arrow/test-util.h"
+#include "arrow/type.h"
+#include "arrow/types/boolean.h"
+#include "arrow/types/construct.h"
+#include "arrow/types/floating.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/primitive.h"
+#include "arrow/types/test-common.h"
+#include "arrow/util/bit-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+using std::string;
+using std::unique_ptr;
+using std::vector;
+
+namespace arrow {
+
+TEST(TypesTest, TestBytesType) {
+  BytesType t1(3);
+
+  ASSERT_EQ(t1.type, LayoutEnum::BYTE);
+  ASSERT_EQ(t1.size, 3);
+}
+
+
+#define PRIMITIVE_TEST(KLASS, ENUM, NAME)       \
+  TEST(TypesTest, TestPrimitive_##ENUM) {       \
+    KLASS tp;                                   \
+    KLASS tp_nn(false);                         \
+                                                \
+    ASSERT_EQ(tp.type, TypeEnum::ENUM);         \
+    ASSERT_EQ(tp.name(), string(NAME));         \
+    ASSERT_TRUE(tp.nullable);                   \
+    ASSERT_FALSE(tp_nn.nullable);               \
+                                                \
+    KLASS tp_copy = tp_nn;                      \
+    ASSERT_FALSE(tp_copy.nullable);             \
+  }
+
+PRIMITIVE_TEST(Int8Type, INT8, "int8");
+PRIMITIVE_TEST(Int16Type, INT16, "int16");
+PRIMITIVE_TEST(Int32Type, INT32, "int32");
+PRIMITIVE_TEST(Int64Type, INT64, "int64");
+PRIMITIVE_TEST(UInt8Type, UINT8, "uint8");
+PRIMITIVE_TEST(UInt16Type, UINT16, "uint16");
+PRIMITIVE_TEST(UInt32Type, UINT32, "uint32");
+PRIMITIVE_TEST(UInt64Type, UINT64, "uint64");
+
+PRIMITIVE_TEST(FloatType, FLOAT, "float");
+PRIMITIVE_TEST(DoubleType, DOUBLE, "double");
+
+PRIMITIVE_TEST(BooleanType, BOOL, "bool");
+
+// ----------------------------------------------------------------------
+// Primitive type tests
+
+TEST_F(TestBuilder, TestResize) {
+  builder_->Init(10);
+  ASSERT_EQ(2, builder_->nulls()->size());
+
+  builder_->Resize(30);
+  ASSERT_EQ(4, builder_->nulls()->size());
+}
+
+template <typename Attrs>
+class TestPrimitiveBuilder : public TestBuilder {
+ public:
+  typedef typename Attrs::ArrayType ArrayType;
+  typedef typename Attrs::BuilderType BuilderType;
+  typedef typename Attrs::T T;
+
+  void SetUp() {
+    TestBuilder::SetUp();
+
+    type_ = Attrs::type();
+    type_nn_ = Attrs::type(false);
+
+    ArrayBuilder* tmp;
+    ASSERT_OK(make_builder(type_, &tmp));
+    builder_.reset(static_cast<BuilderType*>(tmp));
+
+    ASSERT_OK(make_builder(type_nn_, &tmp));
+    builder_nn_.reset(static_cast<BuilderType*>(tmp));
+  }
+
+  void RandomData(int64_t N, double pct_null = 0.1) {
+    Attrs::draw(N, &draws_);
+    random_nulls(N, pct_null, &nulls_);
+  }
+
+  void CheckNullable() {
+    ArrayType result;
+    ArrayType expected;
+    int64_t size = builder_->length();
+
+    auto ex_data = std::make_shared<Buffer>(reinterpret_cast<uint8_t*>(draws_.data()),
+        size * sizeof(T));
+
+    auto ex_nulls = bytes_to_null_buffer(nulls_.data(), size);
+
+    expected.Init(size, ex_data, ex_nulls);
+    ASSERT_OK(builder_->Transfer(&result));
+
+    // Builder is now reset
+    ASSERT_EQ(0, builder_->length());
+    ASSERT_EQ(0, builder_->capacity());
+    ASSERT_EQ(nullptr, builder_->buffer());
+
+    ASSERT_TRUE(result.Equals(expected));
+  }
+
+  void CheckNonNullable() {
+    ArrayType result;
+    ArrayType expected;
+    int64_t size = builder_nn_->length();
+
+    auto ex_data = std::make_shared<Buffer>(reinterpret_cast<uint8_t*>(draws_.data()),
+        size * sizeof(T));
+
+    expected.Init(size, ex_data);
+    ASSERT_OK(builder_nn_->Transfer(&result));
+
+    // Builder is now reset
+    ASSERT_EQ(0, builder_nn_->length());
+    ASSERT_EQ(0, builder_nn_->capacity());
+    ASSERT_EQ(nullptr, builder_nn_->buffer());
+
+    ASSERT_TRUE(result.Equals(expected));
+  }
+
+ protected:
+  TypePtr type_;
+  TypePtr type_nn_;
+  unique_ptr<BuilderType> builder_;
+  unique_ptr<BuilderType> builder_nn_;
+
+  vector<T> draws_;
+  vector<uint8_t> nulls_;
+};
+
+#define PTYPE_DECL(CapType, c_type)             \
+  typedef CapType##Array ArrayType;             \
+  typedef CapType##Builder BuilderType;         \
+  typedef CapType##Type Type;                   \
+  typedef c_type T;                             \
+                                                \
+  static TypePtr type(bool nullable = true) {   \
+    return TypePtr(new Type(nullable));         \
+  }
+
+#define PINT_DECL(CapType, c_type, LOWER, UPPER)    \
+  struct P##CapType {                               \
+    PTYPE_DECL(CapType, c_type);                    \
+    static void draw(int64_t N, vector<T>* draws) {  \
+      randint<T>(N, LOWER, UPPER, draws);           \
+    }                                               \
+  }
+
+PINT_DECL(UInt8, uint8_t, 0, UINT8_MAX);
+PINT_DECL(UInt16, uint16_t, 0, UINT16_MAX);
+PINT_DECL(UInt32, uint32_t, 0, UINT32_MAX);
+PINT_DECL(UInt64, uint64_t, 0, UINT64_MAX);
+
+PINT_DECL(Int8, int8_t, INT8_MIN, INT8_MAX);
+PINT_DECL(Int16, int16_t, INT16_MIN, INT16_MAX);
+PINT_DECL(Int32, int32_t, INT32_MIN, INT32_MAX);
+PINT_DECL(Int64, int64_t, INT64_MIN, INT64_MAX);
+
+typedef ::testing::Types<PUInt8, PUInt16, PUInt32, PUInt64,
+                         PInt8, PInt16, PInt32, PInt64> Primitives;
+
+TYPED_TEST_CASE(TestPrimitiveBuilder, Primitives);
+
+#define DECL_T()                                \
+  typedef typename TestFixture::T T;
+
+#define DECL_ARRAYTYPE()                                \
+  typedef typename TestFixture::ArrayType ArrayType;
+
+
+TYPED_TEST(TestPrimitiveBuilder, TestInit) {
+  DECL_T();
+
+  int64_t n = 1000;
+  ASSERT_OK(this->builder_->Init(n));
+  ASSERT_EQ(n, this->builder_->capacity());
+  ASSERT_EQ(n * sizeof(T), this->builder_->buffer()->size());
+
+  // unsure if this should go in all builder classes
+  ASSERT_EQ(0, this->builder_->num_children());
+}
+
+TYPED_TEST(TestPrimitiveBuilder, TestAppendNull) {
+  int size = 10000;
+  for (int i = 0; i < size; ++i) {
+    ASSERT_OK(this->builder_->AppendNull());
+  }
+
+  Array* result;
+  ASSERT_OK(this->builder_->ToArray(&result));
+  unique_ptr<Array> holder(result);
+
+  for (int i = 0; i < size; ++i) {
+    ASSERT_TRUE(result->IsNull(i));
+  }
+}
+
+
+TYPED_TEST(TestPrimitiveBuilder, TestAppendScalar) {
+  DECL_T();
+
+  int size = 10000;
+
+  vector<T>& draws = this->draws_;
+  vector<uint8_t>& nulls = this->nulls_;
+
+  this->RandomData(size);
+
+  int i;
+  // Append the first 1000
+  for (i = 0; i < 1000; ++i) {
+    ASSERT_OK(this->builder_->Append(draws[i], nulls[i] > 0));
+    ASSERT_OK(this->builder_nn_->Append(draws[i]));
+  }
+
+  ASSERT_EQ(1000, this->builder_->length());
+  ASSERT_EQ(1024, this->builder_->capacity());
+
+  ASSERT_EQ(1000, this->builder_nn_->length());
+  ASSERT_EQ(1024, this->builder_nn_->capacity());
+
+  // Append the next 9000
+  for (i = 1000; i < size; ++i) {
+    ASSERT_OK(this->builder_->Append(draws[i], nulls[i] > 0));
+    ASSERT_OK(this->builder_nn_->Append(draws[i]));
+  }
+
+  ASSERT_EQ(size, this->builder_->length());
+  ASSERT_EQ(util::next_power2(size), this->builder_->capacity());
+
+  ASSERT_EQ(size, this->builder_nn_->length());
+  ASSERT_EQ(util::next_power2(size), this->builder_nn_->capacity());
+
+  this->CheckNullable();
+  this->CheckNonNullable();
+}
+
+
+TYPED_TEST(TestPrimitiveBuilder, TestAppendVector) {
+  DECL_T();
+
+  int size = 10000;
+  this->RandomData(size);
+
+  vector<T>& draws = this->draws_;
+  vector<uint8_t>& nulls = this->nulls_;
+
+  // Append the first K values
+  int K = 1000;
+
+  ASSERT_OK(this->builder_->Append(draws.data(), K, nulls.data()));
+  ASSERT_OK(this->builder_nn_->Append(draws.data(), K));
+
+  ASSERT_EQ(1000, this->builder_->length());
+  ASSERT_EQ(1024, this->builder_->capacity());
+
+  ASSERT_EQ(1000, this->builder_nn_->length());
+  ASSERT_EQ(1024, this->builder_nn_->capacity());
+
+  // Append the next 9000
+  ASSERT_OK(this->builder_->Append(draws.data() + K, size - K, nulls.data() + K));
+  ASSERT_OK(this->builder_nn_->Append(draws.data() + K, size - K));
+
+  ASSERT_EQ(size, this->builder_->length());
+  ASSERT_EQ(util::next_power2(size), this->builder_->capacity());
+
+  this->CheckNullable();
+  this->CheckNonNullable();
+}
+
+TYPED_TEST(TestPrimitiveBuilder, TestAdvance) {
+  int n = 1000;
+  ASSERT_OK(this->builder_->Init(n));
+
+  ASSERT_OK(this->builder_->Advance(100));
+  ASSERT_EQ(100, this->builder_->length());
+
+  ASSERT_OK(this->builder_->Advance(900));
+  ASSERT_RAISES(Invalid, this->builder_->Advance(1));
+}
+
+TYPED_TEST(TestPrimitiveBuilder, TestResize) {
+  DECL_T();
+
+  int cap = MIN_BUILDER_CAPACITY * 2;
+
+  ASSERT_OK(this->builder_->Resize(cap));
+  ASSERT_EQ(cap, this->builder_->capacity());
+
+  ASSERT_EQ(cap * sizeof(T), this->builder_->buffer()->size());
+  ASSERT_EQ(util::ceil_byte(cap) / 8, this->builder_->nulls()->size());
+}
+
+TYPED_TEST(TestPrimitiveBuilder, TestReserve) {
+  int n = 100;
+  ASSERT_OK(this->builder_->Reserve(n));
+  ASSERT_EQ(0, this->builder_->length());
+  ASSERT_EQ(MIN_BUILDER_CAPACITY, this->builder_->capacity());
+
+  ASSERT_OK(this->builder_->Advance(100));
+  ASSERT_OK(this->builder_->Reserve(MIN_BUILDER_CAPACITY));
+
+  ASSERT_EQ(util::next_power2(MIN_BUILDER_CAPACITY + 100),
+      this->builder_->capacity());
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/primitive.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/primitive.cc b/cpp/src/arrow/types/primitive.cc
new file mode 100644
index 0000000..2612e8c
--- /dev/null
+++ b/cpp/src/arrow/types/primitive.cc
@@ -0,0 +1,50 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/primitive.h"
+
+#include <memory>
+
+#include "arrow/util/buffer.h"
+
+namespace arrow {
+
+// ----------------------------------------------------------------------
+// Primitive array base
+
+void PrimitiveArray::Init(const TypePtr& type, int64_t length,
+    const std::shared_ptr<Buffer>& data,
+    const std::shared_ptr<Buffer>& nulls) {
+  Array::Init(type, length, nulls);
+  data_ = data;
+  raw_data_ = data == nullptr ? nullptr : data_->data();
+}
+
+bool PrimitiveArray::Equals(const PrimitiveArray& other) const {
+  if (this == &other) return true;
+  if (type_->nullable != other.type_->nullable) return false;
+
+  bool equal_data = data_->Equals(*other.data_, length_);
+  if (type_->nullable) {
+    return equal_data &&
+      nulls_->Equals(*other.nulls_, util::ceil_byte(length_) / 8);
+  } else {
+    return equal_data;
+  }
+}
+
+} // namespace arrow
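
Editor's note: the Equals method above compares util::ceil_byte(length_) / 8 bytes of the null buffer, and the builder tests expect 2 null bytes for a capacity of 10 and 4 for a capacity of 30. The following is a minimal standalone sketch of that sizing rule and of packing a byte-per-slot null mask into bits. It assumes util::ceil_byte rounds up to a multiple of 8 and that bits are stored least-significant-bit first; neither definition appears in this part of the patch. NullBitmapBytes and PackNullBytes are hypothetical names, not part of the Arrow API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes needed to hold one null bit per slot.
// Equivalent to ceil_byte(length) / 8 under the rounding assumption above.
inline int64_t NullBitmapBytes(int64_t length) {
  return (length + 7) / 8;
}

// Hypothetical helper: pack a byte-per-slot mask (nonzero means null) into a
// bitmap, least-significant bit first (an assumed bit order).
inline std::vector<uint8_t> PackNullBytes(const std::vector<uint8_t>& null_bytes) {
  const int64_t n = static_cast<int64_t>(null_bytes.size());
  std::vector<uint8_t> bits(static_cast<size_t>(NullBitmapBytes(n)), 0);
  for (size_t i = 0; i < null_bytes.size(); ++i) {
    if (null_bytes[i] != 0) {
      bits[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bits;
}
```

With the mask {0, 0, 1, 0, 0} used by the string container test, only bit 2 of the single bitmap byte is set under these assumptions.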

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/primitive.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/primitive.h b/cpp/src/arrow/types/primitive.h
new file mode 100644
index 0000000..a419112
--- /dev/null
+++ b/cpp/src/arrow/types/primitive.h
@@ -0,0 +1,240 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_PRIMITIVE_H
+#define ARROW_TYPES_PRIMITIVE_H
+
+#include <cstdint>
+#include <cstring>
+#include <string>
+
+#include "arrow/array.h"
+#include "arrow/builder.h"
+#include "arrow/type.h"
+#include "arrow/util/bit-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+template <typename Derived>
+struct PrimitiveType : public DataType {
+  explicit PrimitiveType(bool nullable = true)
+      : DataType(Derived::type_enum, nullable) {}
+
+  virtual std::string ToString() const {
+    return std::string(static_cast<const Derived*>(this)->name());
+  }
+};
+
+#define PRIMITIVE_DECL(TYPENAME, C_TYPE, ENUM, SIZE, NAME)          \
+  typedef C_TYPE c_type;                                            \
+  static constexpr TypeEnum type_enum = TypeEnum::ENUM;             \
+  static constexpr int size = SIZE;                              \
+                                                                    \
+  explicit TYPENAME(bool nullable = true)                           \
+      : PrimitiveType<TYPENAME>(nullable) {}                        \
+                                                                    \
+  static const char* name() {                                       \
+    return NAME;                                                    \
+  }
+
+
+// Base class for fixed-size logical types
+class PrimitiveArray : public Array {
+ public:
+  PrimitiveArray() : Array(), data_(nullptr), raw_data_(nullptr) {}
+
+  virtual ~PrimitiveArray() {}
+
+  void Init(const TypePtr& type, int64_t length, const std::shared_ptr<Buffer>& data,
+      const std::shared_ptr<Buffer>& nulls = nullptr);
+
+  const std::shared_ptr<Buffer>& data() const { return data_; }
+
+  bool Equals(const PrimitiveArray& other) const;
+
+ protected:
+  std::shared_ptr<Buffer> data_;
+  const uint8_t* raw_data_;
+};
+
+
+template <typename TypeClass>
+class PrimitiveArrayImpl : public PrimitiveArray {
+ public:
+  typedef typename TypeClass::c_type T;
+
+  PrimitiveArrayImpl() : PrimitiveArray() {}
+
+  PrimitiveArrayImpl(int64_t length, const std::shared_ptr<Buffer>& data,
+      const std::shared_ptr<Buffer>& nulls = nullptr) {
+    Init(length, data, nulls);
+  }
+
+  void Init(int64_t length, const std::shared_ptr<Buffer>& data,
+      const std::shared_ptr<Buffer>& nulls = nullptr) {
+    TypePtr type(new TypeClass(nulls != nullptr));
+    PrimitiveArray::Init(type, length, data, nulls);
+  }
+
+  bool Equals(const PrimitiveArrayImpl& other) const {
+    return PrimitiveArray::Equals(*static_cast<const PrimitiveArray*>(&other));
+  }
+
+  const T* raw_data() const { return reinterpret_cast<const T*>(raw_data_); }
+
+  T Value(int64_t i) const {
+    return raw_data()[i];
+  }
+
+  TypeClass* exact_type() const {
+    return static_cast<TypeClass*>(type_);
+  }
+};
+
+
+template <typename Type, typename ArrayType>
+class PrimitiveBuilder : public ArrayBuilder {
+ public:
+  typedef typename Type::c_type T;
+
+  explicit PrimitiveBuilder(const TypePtr& type)
+      : ArrayBuilder(type), values_(nullptr) {
+    elsize_ = sizeof(T);
+  }
+
+  virtual ~PrimitiveBuilder() {}
+
+  Status Resize(int64_t capacity) {
+    // XXX: Set floor size for now
+    if (capacity < MIN_BUILDER_CAPACITY) {
+      capacity = MIN_BUILDER_CAPACITY;
+    }
+
+    if (capacity_ == 0) {
+      RETURN_NOT_OK(Init(capacity));
+    } else {
+      RETURN_NOT_OK(ArrayBuilder::Resize(capacity));
+      RETURN_NOT_OK(values_->Resize(capacity * elsize_));
+      capacity_ = capacity;
+    }
+    return Status::OK();
+  }
+
+  Status Init(int64_t capacity) {
+    RETURN_NOT_OK(ArrayBuilder::Init(capacity));
+
+    values_ = std::make_shared<OwnedMutableBuffer>();
+    return values_->Resize(capacity * elsize_);
+  }
+
+  Status Reserve(int64_t elements) {
+    if (length_ + elements > capacity_) {
+      int64_t new_capacity = util::next_power2(length_ + elements);
+      return Resize(new_capacity);
+    }
+    return Status::OK();
+  }
+
+  Status Advance(int64_t elements) {
+    return ArrayBuilder::Advance(elements);
+  }
+
+  // Scalar append
+  Status Append(T val, bool is_null = false) {
+    if (length_ == capacity_) {
+      // Grow to the next power of 2 above the current capacity
+      RETURN_NOT_OK(Resize(util::next_power2(capacity_ + 1)));
+    }
+    if (nullable_) {
+      util::set_bit(null_bits_, length_, is_null);
+    }
+    raw_buffer()[length_++] = val;
+    return Status::OK();
+  }
+
+  // Vector append
+  //
+  // If passed, null_bytes is of equal length to values, and any nonzero byte
+  // will be considered as a null for that slot
+  Status Append(const T* values, int64_t length, uint8_t* null_bytes = nullptr) {
+    if (length_ + length > capacity_) {
+      int64_t new_capacity = util::next_power2(length_ + length);
+      RETURN_NOT_OK(Resize(new_capacity));
+    }
+    memcpy(raw_buffer() + length_, values, length * elsize_);
+
+    if (nullable_ && null_bytes != nullptr) {
+      // Mark each slot whose null_bytes entry is nonzero as null
+      for (int64_t i = 0; i < length; ++i) {
+        util::set_bit(null_bits_, length_ + i, static_cast<bool>(null_bytes[i]));
+      }
+    }
+
+    length_ += length;
+    return Status::OK();
+  }
+
+  Status AppendNull() {
+    if (!nullable_) {
+      return Status::Invalid("not nullable");
+    }
+    if (length_ == capacity_) {
+      // Grow to the next power of 2 above the current capacity
+      RETURN_NOT_OK(Resize(util::next_power2(capacity_ + 1)));
+    }
+    util::set_bit(null_bits_, length_++, true);
+    return Status::OK();
+  }
+
+  // Initialize an array type instance with the results of this builder
+  // Transfers ownership of all buffers
+  Status Transfer(PrimitiveArray* out) {
+    out->Init(type_, length_, values_, nulls_);
+    values_ = nulls_ = nullptr;
+    capacity_ = length_ = 0;
+    return Status::OK();
+  }
+
+  Status Transfer(ArrayType* out) {
+    return Transfer(static_cast<PrimitiveArray*>(out));
+  }
+
+  virtual Status ToArray(Array** out) {
+    ArrayType* result = new ArrayType();
+    RETURN_NOT_OK(Transfer(result));
+    *out = static_cast<Array*>(result);
+    return Status::OK();
+  }
+
+  T* raw_buffer() {
+    return reinterpret_cast<T*>(values_->mutable_data());
+  }
+
+  std::shared_ptr<Buffer> buffer() const {
+    return values_;
+  }
+
+ protected:
+  std::shared_ptr<OwnedMutableBuffer> values_;
+  int64_t elsize_;
+};
+
+} // namespace arrow
+
+#endif  // ARROW_TYPES_PRIMITIVE_H
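
Editor's note: the builder tests earlier in this patch expect a capacity of 1024 after 1000 scalar appends and util::next_power2(size) after 10000. The following minimal sketch replays the "grow when length == capacity" rule from the scalar Append above. It assumes util::next_power2 returns the smallest power of two greater than or equal to its argument, and it uses 32 as a placeholder for MIN_BUILDER_CAPACITY; both are defined elsewhere in the patch and are not shown here. NextPower2, kMinCapacity and CapacityAfterAppends are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical stand-in for util::next_power2: smallest power of two >= n.
inline int64_t NextPower2(int64_t n) {
  int64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Assumed floor; the real MIN_BUILDER_CAPACITY is defined elsewhere.
constexpr int64_t kMinCapacity = 32;

// Capacity reached after appending `count` values one at a time to an empty
// builder, growing only when the buffer is full.
inline int64_t CapacityAfterAppends(int64_t count) {
  int64_t length = 0;
  int64_t capacity = 0;
  for (int64_t i = 0; i < count; ++i) {
    if (length == capacity) {
      int64_t wanted = NextPower2(capacity + 1);
      capacity = wanted < kMinCapacity ? kMinCapacity : wanted;
    }
    ++length;
  }
  return capacity;
}
```

Because each growth step doubles a power-of-two capacity, the number of reallocations stays logarithmic in the number of appended values.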

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/string-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/string-test.cc b/cpp/src/arrow/types/string-test.cc
new file mode 100644
index 0000000..6dba3fd
--- /dev/null
+++ b/cpp/src/arrow/types/string-test.cc
@@ -0,0 +1,242 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/builder.h"
+#include "arrow/test-util.h"
+#include "arrow/type.h"
+#include "arrow/types/construct.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/string.h"
+#include "arrow/types/test-common.h"
+#include "arrow/util/status.h"
+
+using std::string;
+using std::unique_ptr;
+using std::vector;
+
+namespace arrow {
+
+
+TEST(TypesTest, TestCharType) {
+  CharType t1(5);
+
+  ASSERT_EQ(t1.type, TypeEnum::CHAR);
+  ASSERT_TRUE(t1.nullable);
+  ASSERT_EQ(t1.size, 5);
+
+  ASSERT_EQ(t1.ToString(), string("char(5)"));
+
+  // Test copy constructor
+  CharType t2 = t1;
+  ASSERT_EQ(t2.type, TypeEnum::CHAR);
+  ASSERT_TRUE(t2.nullable);
+  ASSERT_EQ(t2.size, 5);
+}
+
+
+TEST(TypesTest, TestVarcharType) {
+  VarcharType t1(5);
+
+  ASSERT_EQ(t1.type, TypeEnum::VARCHAR);
+  ASSERT_TRUE(t1.nullable);
+  ASSERT_EQ(t1.size, 5);
+  ASSERT_EQ(t1.physical_type.size, 6);
+
+  ASSERT_EQ(t1.ToString(), string("varchar(5)"));
+
+  // Test copy constructor
+  VarcharType t2 = t1;
+  ASSERT_EQ(t2.type, TypeEnum::VARCHAR);
+  ASSERT_TRUE(t2.nullable);
+  ASSERT_EQ(t2.size, 5);
+  ASSERT_EQ(t2.physical_type.size, 6);
+}
+
+TEST(TypesTest, TestStringType) {
+  StringType str;
+  StringType str_nn(false);
+
+  ASSERT_EQ(str.type, TypeEnum::STRING);
+  ASSERT_EQ(str.name(), string("string"));
+  ASSERT_TRUE(str.nullable);
+  ASSERT_FALSE(str_nn.nullable);
+}
+
+// ----------------------------------------------------------------------
+// String container
+
+class TestStringContainer : public ::testing::Test  {
+ public:
+  void SetUp() {
+    chars_ = {'a', 'b', 'b', 'c', 'c', 'c'};
+    offsets_ = {0, 1, 1, 1, 3, 6};
+    nulls_ = {0, 0, 1, 0, 0};
+    expected_ = {"a", "", "", "bb", "ccc"};
+
+    MakeArray();
+  }
+
+  void MakeArray() {
+    length_ = offsets_.size() - 1;
+    int64_t nchars = chars_.size();
+
+    value_buf_ = to_buffer(chars_);
+    values_ = ArrayPtr(new UInt8Array(nchars, value_buf_));
+
+    offsets_buf_ = to_buffer(offsets_);
+
+    nulls_buf_ = bytes_to_null_buffer(nulls_.data(), nulls_.size());
+    strings_.Init(length_, offsets_buf_, values_, nulls_buf_);
+  }
+
+ protected:
+  vector<int32_t> offsets_;
+  vector<char> chars_;
+  vector<uint8_t> nulls_;
+
+  vector<string> expected_;
+
+  std::shared_ptr<Buffer> value_buf_;
+  std::shared_ptr<Buffer> offsets_buf_;
+  std::shared_ptr<Buffer> nulls_buf_;
+
+  int64_t length_;
+
+  ArrayPtr values_;
+  StringArray strings_;
+};
+
+
+TEST_F(TestStringContainer, TestArrayBasics) {
+  ASSERT_EQ(length_, strings_.length());
+  ASSERT_TRUE(strings_.nullable());
+}
+
+TEST_F(TestStringContainer, TestType) {
+  TypePtr type = strings_.type();
+
+  ASSERT_EQ(TypeEnum::STRING, type->type);
+  ASSERT_EQ(TypeEnum::STRING, strings_.type_enum());
+}
+
+
+TEST_F(TestStringContainer, TestListFunctions) {
+  int pos = 0;
+  for (size_t i = 0; i < expected_.size(); ++i) {
+    ASSERT_EQ(pos, strings_.value_offset(i));
+    ASSERT_EQ(expected_[i].size(), strings_.value_length(i));
+    pos += expected_[i].size();
+  }
+}
+
+
+TEST_F(TestStringContainer, TestDestructor) {
+  auto arr = std::make_shared<StringArray>(length_, offsets_buf_, values_, nulls_buf_);
+}
+
+TEST_F(TestStringContainer, TestGetString) {
+  for (size_t i = 0; i < expected_.size(); ++i) {
+    if (nulls_[i]) {
+      ASSERT_TRUE(strings_.IsNull(i));
+    } else {
+      ASSERT_EQ(expected_[i], strings_.GetString(i));
+    }
+  }
+}
+
+// ----------------------------------------------------------------------
+// String builder tests
+
+class TestStringBuilder : public TestBuilder {
+ public:
+  void SetUp() {
+    TestBuilder::SetUp();
+    type_ = TypePtr(new StringType());
+
+    ArrayBuilder* tmp;
+    ASSERT_OK(make_builder(type_, &tmp));
+    builder_.reset(static_cast<StringBuilder*>(tmp));
+  }
+
+  void Done() {
+    Array* out;
+    ASSERT_OK(builder_->ToArray(&out));
+    result_.reset(static_cast<StringArray*>(out));
+  }
+
+ protected:
+  TypePtr type_;
+
+  unique_ptr<StringBuilder> builder_;
+  unique_ptr<StringArray> result_;
+};
+
+TEST_F(TestStringBuilder, TestAttrs) {
+  ASSERT_FALSE(builder_->value_builder()->nullable());
+}
+
+TEST_F(TestStringBuilder, TestScalarAppend) {
+  vector<string> strings = {"a", "bb", "", "", "ccc"};
+  vector<uint8_t> is_null = {0, 0, 0, 1, 0};
+
+  int N = strings.size();
+  int reps = 1000;
+
+  for (int j = 0; j < reps; ++j) {
+    for (int i = 0; i < N; ++i) {
+      if (is_null[i]) {
+        ASSERT_OK(builder_->AppendNull());
+      } else {
+        ASSERT_OK(builder_->Append(strings[i]));
+      }
+    }
+  }
+  Done();
+
+  ASSERT_EQ(reps * N, result_->length());
+  ASSERT_EQ(reps * 6, result_->values()->length());
+
+  int64_t length;
+  int64_t pos = 0;
+  for (int i = 0; i < N * reps; ++i) {
+    if (is_null[i % N]) {
+      ASSERT_TRUE(result_->IsNull(i));
+    } else {
+      ASSERT_FALSE(result_->IsNull(i));
+      result_->GetValue(i, &length);
+      ASSERT_EQ(pos, result_->offset(i));
+      ASSERT_EQ(strings[i % N].size(), length);
+      ASSERT_EQ(strings[i % N], result_->GetString(i));
+
+      pos += length;
+    }
+  }
+}
+
+TEST_F(TestStringBuilder, TestZeroLength) {
+  // All buffers are null
+  Done();
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/string.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/string.cc b/cpp/src/arrow/types/string.cc
new file mode 100644
index 0000000..f3dfbdc
--- /dev/null
+++ b/cpp/src/arrow/types/string.cc
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/string.h"
+
+#include <sstream>
+#include <string>
+
+namespace arrow {
+
+std::string CharType::ToString() const {
+  std::stringstream s;
+  s << "char(" << size << ")";
+  return s.str();
+}
+
+
+std::string VarcharType::ToString() const {
+  std::stringstream s;
+  s << "varchar(" << size << ")";
+  return s.str();
+}
+
+TypePtr StringBuilder::value_type_ = TypePtr(new UInt8Type(false));
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/string.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/string.h b/cpp/src/arrow/types/string.h
new file mode 100644
index 0000000..30d6e24
--- /dev/null
+++ b/cpp/src/arrow/types/string.h
@@ -0,0 +1,181 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_STRING_H
+#define ARROW_TYPES_STRING_H
+
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/type.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/list.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+class ArrayBuilder;
+
+struct CharType : public DataType {
+  int size;
+
+  BytesType physical_type;
+
+  explicit CharType(int size, bool nullable = true)
+      : DataType(TypeEnum::CHAR, nullable),
+        size(size),
+        physical_type(BytesType(size)) {}
+
+  CharType(const CharType& other)
+      : CharType(other.size, other.nullable) {}
+
+  virtual std::string ToString() const;
+};
+
+
+// Variable-length, null-terminated strings, up to a certain length
+struct VarcharType : public DataType {
+  int size;
+
+  BytesType physical_type;
+
+  explicit VarcharType(int size, bool nullable = true)
+      : DataType(TypeEnum::VARCHAR, nullable),
+        size(size),
+        physical_type(BytesType(size + 1)) {}
+  VarcharType(const VarcharType& other)
+      : VarcharType(other.size, other.nullable) {}
+
+  virtual std::string ToString() const;
+};
+
+static const LayoutPtr byte1(new BytesType(1));
+static const LayoutPtr physical_string = LayoutPtr(new ListLayoutType(byte1));
+
+// String is a logical type consisting of a physical list of 1-byte values
+struct StringType : public DataType {
+  explicit StringType(bool nullable = true)
+      : DataType(TypeEnum::STRING, nullable) {}
+
+  StringType(const StringType& other)
+      : StringType(other.nullable) {}
+
+  const LayoutPtr& physical_type() {
+    return physical_string;
+  }
+
+  static const char* name() {
+    return "string";
+  }
+
+  virtual std::string ToString() const {
+    return name();
+  }
+};
+
+
+// TODO: add a BinaryArray layer in between
+class StringArray : public ListArray {
+ public:
+  StringArray() : ListArray(), bytes_(nullptr), raw_bytes_(nullptr) {}
+
+  StringArray(int64_t length, const std::shared_ptr<Buffer>& offsets,
+      const ArrayPtr& values,
+      const std::shared_ptr<Buffer>& nulls = nullptr) {
+    Init(length, offsets, values, nulls);
+  }
+
+  void Init(const TypePtr& type, int64_t length,
+      const std::shared_ptr<Buffer>& offsets,
+      const ArrayPtr& values,
+      const std::shared_ptr<Buffer>& nulls = nullptr) {
+    ListArray::Init(type, length, offsets, values, nulls);
+
+    // TODO: type validation for values array
+
+    // For convenience
+    bytes_ = static_cast<UInt8Array*>(values.get());
+    raw_bytes_ = bytes_->raw_data();
+  }
+
+  void Init(int64_t length, const std::shared_ptr<Buffer>& offsets,
+      const ArrayPtr& values,
+      const std::shared_ptr<Buffer>& nulls = nullptr) {
+    TypePtr type(new StringType(nulls != nullptr));
+    Init(type, length, offsets, values, nulls);
+  }
+
+  // Compute the pointer to the bytes of value i and store its length in out_length
+  const uint8_t* GetValue(int64_t i, int64_t* out_length) const {
+    int32_t pos = offsets_[i];
+    *out_length = offsets_[i + 1] - pos;
+    return raw_bytes_ + pos;
+  }
+
+  // Construct a std::string
+  std::string GetString(int64_t i) const {
+    int64_t nchars;
+    const uint8_t* str = GetValue(i, &nchars);
+    return std::string(reinterpret_cast<const char*>(str), nchars);
+  }
+
+ private:
+  UInt8Array* bytes_;
+  const uint8_t* raw_bytes_;
+};
+
+// Array builder
+
+
+
+class StringBuilder : public ListBuilder {
+ public:
+  explicit StringBuilder(const TypePtr& type) :
+      ListBuilder(type, static_cast<ArrayBuilder*>(new UInt8Builder(value_type_))) {
+    byte_builder_ = static_cast<UInt8Builder*>(value_builder_.get());
+  }
+
+  Status Append(const std::string& value) {
+    RETURN_NOT_OK(ListBuilder::Append());
+    return byte_builder_->Append(reinterpret_cast<const uint8_t*>(value.c_str()),
+        value.size());
+  }
+
+  Status Append(const uint8_t* value, int64_t length);
+  Status Append(const std::vector<std::string>& values,
+                uint8_t* null_bytes);
+
+  virtual Status ToArray(Array** out) {
+    StringArray* result = new StringArray();
+    RETURN_NOT_OK(ListBuilder::Transfer(result));
+    *out = static_cast<Array*>(result);
+    return Status::OK();
+  }
+
+ protected:
+  UInt8Builder* byte_builder_;
+
+  static TypePtr value_type_;
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_STRING_H
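
Editor's note: StringArray::GetValue above reads slot i as the byte range [offsets[i], offsets[i + 1]) of one shared character buffer. The sketch below shows the same slicing with plain standard-library containers, using the data from the TestStringContainer fixture. SliceString is a hypothetical helper, not part of the patch, and it ignores the null bitmap.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper mirroring GetString: slot i spans bytes
// [offsets[i], offsets[i + 1]) of the shared character buffer.
// The caller must ensure 0 <= i < offsets.size() - 1.
inline std::string SliceString(const std::vector<char>& chars,
                               const std::vector<int32_t>& offsets,
                               int64_t i) {
  const int32_t pos = offsets[static_cast<size_t>(i)];
  const int32_t length = offsets[static_cast<size_t>(i) + 1] - pos;
  return std::string(chars.data() + pos, static_cast<size_t>(length));
}
```

An array of N strings therefore needs N + 1 offsets, and an empty or null slot simply repeats the previous offset.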

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/struct-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/struct-test.cc b/cpp/src/arrow/types/struct-test.cc
new file mode 100644
index 0000000..644b545
--- /dev/null
+++ b/cpp/src/arrow/types/struct-test.cc
@@ -0,0 +1,61 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+
+#include <string>
+#include <vector>
+
+#include "arrow/field.h"
+#include "arrow/type.h"
+#include "arrow/types/integer.h"
+#include "arrow/types/string.h"
+#include "arrow/types/struct.h"
+
+using std::string;
+using std::vector;
+
+namespace arrow {
+
+TEST(TestStructType, Basics) {
+  TypePtr f0_type = TypePtr(new Int32Type());
+  Field f0("f0", f0_type);
+
+  TypePtr f1_type = TypePtr(new StringType());
+  Field f1("f1", f1_type);
+
+  TypePtr f2_type = TypePtr(new UInt8Type());
+  Field f2("f2", f2_type);
+
+  vector<Field> fields = {f0, f1, f2};
+
+  StructType struct_type(fields, true);
+  StructType struct_type_nn(fields, false);
+
+  ASSERT_TRUE(struct_type.nullable);
+  ASSERT_FALSE(struct_type_nn.nullable);
+
+  ASSERT_TRUE(struct_type.field(0).Equals(f0));
+  ASSERT_TRUE(struct_type.field(1).Equals(f1));
+  ASSERT_TRUE(struct_type.field(2).Equals(f2));
+
+  ASSERT_EQ(struct_type.ToString(), "struct<f0: int32, f1: string, f2: uint8>");
+
+  // TODO: out of bounds for field(...)
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/struct.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/struct.cc b/cpp/src/arrow/types/struct.cc
new file mode 100644
index 0000000..b7be5d8
--- /dev/null
+++ b/cpp/src/arrow/types/struct.cc
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/struct.h"
+
+#include <memory>
+#include <sstream>
+#include <string>
+
+namespace arrow {
+
+std::string StructType::ToString() const {
+  std::stringstream s;
+  s << "struct<";
+  for (size_t i = 0; i < fields_.size(); ++i) {
+    if (i > 0) s << ", ";
+    const Field& field = fields_[i];
+    s << field.name << ": " << field.type->ToString();
+  }
+  s << ">";
+  return s.str();
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/struct.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/struct.h b/cpp/src/arrow/types/struct.h
new file mode 100644
index 0000000..7d8885b
--- /dev/null
+++ b/cpp/src/arrow/types/struct.h
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_STRUCT_H
+#define ARROW_TYPES_STRUCT_H
+
+#include <string>
+#include <vector>
+
+#include "arrow/field.h"
+#include "arrow/type.h"
+
+namespace arrow {
+
+struct StructType : public DataType {
+  std::vector<Field> fields_;
+
+  StructType(const std::vector<Field>& fields,
+      bool nullable = true)
+      : DataType(TypeEnum::STRUCT, nullable) {
+    fields_ = fields;
+  }
+
+  const Field& field(int i) const {
+    return fields_[i];
+  }
+
+  int num_children() const {
+    return static_cast<int>(fields_.size());
+  }
+
+  virtual std::string ToString() const;
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_STRUCT_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/test-common.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/test-common.h b/cpp/src/arrow/types/test-common.h
new file mode 100644
index 0000000..267e48a
--- /dev/null
+++ b/cpp/src/arrow/types/test-common.h
@@ -0,0 +1,50 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_TEST_COMMON_H
+#define ARROW_TYPES_TEST_COMMON_H
+
+#include <gtest/gtest.h>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/test-util.h"
+#include "arrow/type.h"
+
+using std::unique_ptr;
+
+namespace arrow {
+
+class TestBuilder : public ::testing::Test {
+ public:
+  void SetUp() {
+    type_ = TypePtr(new UInt8Type());
+    type_nn_ = TypePtr(new UInt8Type(false));
+    builder_.reset(new UInt8Builder(type_));
+    builder_nn_.reset(new UInt8Builder(type_nn_));
+  }
+ protected:
+  TypePtr type_;
+  TypePtr type_nn_;
+  unique_ptr<ArrayBuilder> builder_;
+  unique_ptr<ArrayBuilder> builder_nn_;
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_TEST_COMMON_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/union.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/union.cc b/cpp/src/arrow/types/union.cc
new file mode 100644
index 0000000..54f41a7
--- /dev/null
+++ b/cpp/src/arrow/types/union.cc
@@ -0,0 +1,49 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/types/union.h"
+
+#include <sstream>
+#include <string>
+#include <vector>
+
+#include "arrow/type.h"
+
+namespace arrow {
+
+static inline std::string format_union(const std::vector<TypePtr>& child_types) {
+  std::stringstream s;
+  s << "union<";
+  for (size_t i = 0; i < child_types.size(); ++i) {
+    if (i) s << ", ";
+    s << child_types[i]->ToString();
+  }
+  s << ">";
+  return s.str();
+}
+
+std::string DenseUnionType::ToString() const {
+  return format_union(child_types_);
+}
+
+
+std::string SparseUnionType::ToString() const {
+  return format_union(child_types_);
+}
+
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/types/union.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/types/union.h b/cpp/src/arrow/types/union.h
new file mode 100644
index 0000000..7b66c3b
--- /dev/null
+++ b/cpp/src/arrow/types/union.h
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_TYPES_UNION_H
+#define ARROW_TYPES_UNION_H
+
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "arrow/array.h"
+#include "arrow/type.h"
+#include "arrow/types/collection.h"
+
+namespace arrow {
+
+class Buffer;
+
+struct DenseUnionType : public CollectionType<TypeEnum::DENSE_UNION> {
+  typedef CollectionType<TypeEnum::DENSE_UNION> Base;
+
+  DenseUnionType(const std::vector<TypePtr>& child_types,
+      bool nullable = true)
+      : Base(nullable) {
+    child_types_ = child_types;
+  }
+
+  virtual std::string ToString() const;
+};
+
+
+struct SparseUnionType : public CollectionType<TypeEnum::SPARSE_UNION> {
+  typedef CollectionType<TypeEnum::SPARSE_UNION> Base;
+
+  SparseUnionType(const std::vector<TypePtr>& child_types,
+      bool nullable = true)
+      : Base(nullable) {
+    child_types_ = child_types;
+  }
+
+  virtual std::string ToString() const;
+};
+
+
+class UnionArray : public Array {
+ public:
+  UnionArray() : Array() {}
+
+ protected:
+  // The data are types encoded as int16
+  Buffer* types_;
+  std::vector<std::shared_ptr<Array> > children_;
+};
+
+
+class DenseUnionArray : public UnionArray {
+ public:
+  DenseUnionArray() : UnionArray() {}
+
+ protected:
+  Buffer* offset_buf_;
+};
+
+
+class SparseUnionArray : public UnionArray {
+ public:
+  SparseUnionArray() : UnionArray() {}
+};
+
+} // namespace arrow
+
+#endif // ARROW_TYPES_UNION_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/CMakeLists.txt b/cpp/src/arrow/util/CMakeLists.txt
new file mode 100644
index 0000000..88e3f7a
--- /dev/null
+++ b/cpp/src/arrow/util/CMakeLists.txt
@@ -0,0 +1,81 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#######################################
+# arrow_util
+#######################################
+
+set(UTIL_SRCS
+  bit-util.cc
+  buffer.cc
+  status.cc
+)
+
+set(UTIL_LIBS
+  rt)
+
+add_library(arrow_util STATIC
+  ${UTIL_SRCS}
+)
+target_link_libraries(arrow_util ${UTIL_LIBS})
+SET_TARGET_PROPERTIES(arrow_util PROPERTIES LINKER_LANGUAGE CXX)
+
+# Headers: top level
+install(FILES
+  bit-util.h
+  buffer.h
+  macros.h
+  status.h
+  DESTINATION include/arrow/util)
+
+#######################################
+# arrow_test_util
+#######################################
+
+add_library(arrow_test_util)
+target_link_libraries(arrow_test_util
+  arrow_util)
+
+SET_TARGET_PROPERTIES(arrow_test_util PROPERTIES LINKER_LANGUAGE CXX)
+
+#######################################
+# arrow_test_main
+#######################################
+
+add_library(arrow_test_main
+  test_main.cc)
+
+if (APPLE)
+  target_link_libraries(arrow_test_main
+    gtest
+    arrow_util
+    arrow_test_util
+    dl)
+  set_target_properties(arrow_test_main
+        PROPERTIES LINK_FLAGS "-undefined dynamic_lookup")
+else()
+  target_link_libraries(arrow_test_main
+    gtest
+    arrow_util
+    arrow_test_util
+    pthread
+    dl
+  )
+endif()
+
+ADD_ARROW_TEST(bit-util-test)
+ADD_ARROW_TEST(buffer-test)

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/bit-util-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/bit-util-test.cc b/cpp/src/arrow/util/bit-util-test.cc
new file mode 100644
index 0000000..7506ca5
--- /dev/null
+++ b/cpp/src/arrow/util/bit-util-test.cc
@@ -0,0 +1,44 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+
+#include "arrow/util/bit-util.h"
+
+namespace arrow {
+
+TEST(UtilTests, TestNextPower2) {
+  using util::next_power2;
+
+  ASSERT_EQ(8, next_power2(6));
+  ASSERT_EQ(8, next_power2(8));
+
+  ASSERT_EQ(1, next_power2(1));
+  ASSERT_EQ(256, next_power2(131));
+
+  ASSERT_EQ(1024, next_power2(1000));
+
+  ASSERT_EQ(4096, next_power2(4000));
+
+  ASSERT_EQ(65536, next_power2(64000));
+
+  ASSERT_EQ(1LL << 32, next_power2((1LL << 32) - 1));
+  ASSERT_EQ(1LL << 31, next_power2((1LL << 31) - 1));
+  ASSERT_EQ(1LL << 62, next_power2((1LL << 62) - 1));
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/bit-util.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/bit-util.cc b/cpp/src/arrow/util/bit-util.cc
new file mode 100644
index 0000000..d2ddd65
--- /dev/null
+++ b/cpp/src/arrow/util/bit-util.cc
@@ -0,0 +1,46 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <cstring>
+
+#include "arrow/util/bit-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+void util::bytes_to_bits(uint8_t* bytes, int length, uint8_t* bits) {
+  for (int i = 0; i < length; ++i) {
+    set_bit(bits, i, static_cast<bool>(bytes[i]));
+  }
+}
+
+Status util::bytes_to_bits(uint8_t* bytes, int length,
+    std::shared_ptr<Buffer>* out) {
+  // Number of bytes needed to hold one bit per input byte
+  int64_t nbytes = ceil_byte(length) / 8;
+  auto buffer = std::make_shared<OwnedMutableBuffer>();
+  RETURN_NOT_OK(buffer->Resize(nbytes));
+  memset(buffer->mutable_data(), 0, nbytes);
+  bytes_to_bits(bytes, length, buffer->mutable_data());
+
+  *out = buffer;
+
+  return Status::OK();
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/bit-util.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/bit-util.h b/cpp/src/arrow/util/bit-util.h
new file mode 100644
index 0000000..61dffa3
--- /dev/null
+++ b/cpp/src/arrow/util/bit-util.h
@@ -0,0 +1,68 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_UTIL_BIT_UTIL_H
+#define ARROW_UTIL_BIT_UTIL_H
+
+#include <cstdint>
+#include <cstdlib>
+#include <memory>
+
+#include "arrow/util/buffer.h"
+
+namespace arrow {
+
+class Status;
+
+namespace util {
+
+static inline int64_t ceil_byte(int64_t size) {
+  return (size + 7) & ~7;
+}
+
+static inline int64_t ceil_2bytes(int64_t size) {
+  return (size + 15) & ~15;
+}
+
+static inline bool get_bit(const uint8_t* bits, int i) {
+  return bits[i / 8] & (1 << (i % 8));
+}
+
+static inline void set_bit(uint8_t* bits, int i, bool is_set) {
+  bits[i / 8] |= (1 << (i % 8)) * is_set;
+}
+
+static inline int64_t next_power2(int64_t n) {
+  n--;
+  n |= n >> 1;
+  n |= n >> 2;
+  n |= n >> 4;
+  n |= n >> 8;
+  n |= n >> 16;
+  n |= n >> 32;
+  n++;
+  return n;
+}
+
+void bytes_to_bits(uint8_t* bytes, int length, uint8_t* bits);
+Status bytes_to_bits(uint8_t*, int, std::shared_ptr<Buffer>*);
+
+} // namespace util
+
+} // namespace arrow
+
+#endif // ARROW_UTIL_BIT_UTIL_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/buffer-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/buffer-test.cc b/cpp/src/arrow/util/buffer-test.cc
new file mode 100644
index 0000000..edfd08e
--- /dev/null
+++ b/cpp/src/arrow/util/buffer-test.cc
@@ -0,0 +1,58 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+#include <cstdlib>
+#include <cstdint>
+#include <limits>
+#include <memory>
+#include <string>
+
+#include "arrow/test-util.h"
+#include "arrow/util/buffer.h"
+#include "arrow/util/status.h"
+
+using std::string;
+
+namespace arrow {
+
+class TestBuffer : public ::testing::Test {
+};
+
+TEST_F(TestBuffer, Resize) {
+  OwnedMutableBuffer buf;
+
+  ASSERT_EQ(0, buf.size());
+  ASSERT_OK(buf.Resize(100));
+  ASSERT_EQ(100, buf.size());
+  ASSERT_OK(buf.Resize(200));
+  ASSERT_EQ(200, buf.size());
+
+  // Make it smaller, too
+  ASSERT_OK(buf.Resize(50));
+  ASSERT_EQ(50, buf.size());
+}
+
+TEST_F(TestBuffer, ResizeOOM) {
+  // realloc fails, even though there may be no explicit limit
+  OwnedMutableBuffer buf;
+  ASSERT_OK(buf.Resize(100));
+  int64_t to_alloc = std::numeric_limits<int64_t>::max();
+  ASSERT_RAISES(OutOfMemory, buf.Resize(to_alloc));
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/buffer.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/buffer.cc b/cpp/src/arrow/util/buffer.cc
new file mode 100644
index 0000000..2fb34d5
--- /dev/null
+++ b/cpp/src/arrow/util/buffer.cc
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/buffer.h"
+
+#include <cstdint>
+
+#include "arrow/util/status.h"
+
+namespace arrow {
+
+Buffer::Buffer(const std::shared_ptr<Buffer>& parent, int64_t offset,
+    int64_t size) {
+  data_ = parent->data() + offset;
+  size_ = size;
+  parent_ = parent;
+}
+
+std::shared_ptr<Buffer> MutableBuffer::GetImmutableView() {
+  return std::make_shared<Buffer>(this->get_shared_ptr(), 0, size());
+}
+
+OwnedMutableBuffer::OwnedMutableBuffer() :
+    MutableBuffer(nullptr, 0) {}
+
+Status OwnedMutableBuffer::Resize(int64_t new_size) {
+  try {
+    buffer_owner_.resize(new_size);
+  } catch (const std::bad_alloc&) {
+    return Status::OutOfMemory("resize failed");
+  }
+  // Update size_ only after the allocation has succeeded
+  size_ = new_size;
+  data_ = buffer_owner_.data();
+  mutable_data_ = buffer_owner_.data();
+  return Status::OK();
+}
+
+} // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/buffer.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/buffer.h b/cpp/src/arrow/util/buffer.h
new file mode 100644
index 0000000..3e41839
--- /dev/null
+++ b/cpp/src/arrow/util/buffer.h
@@ -0,0 +1,133 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_UTIL_BUFFER_H
+#define ARROW_UTIL_BUFFER_H
+
+#include <cstdint>
+#include <cstdlib>
+#include <cstring>
+#include <memory>
+#include <vector>
+
+#include "arrow/util/macros.h"
+
+namespace arrow {
+
+class Status;
+
+// ----------------------------------------------------------------------
+// Buffer classes
+
+// Immutable API for a chunk of bytes which may or may not be owned by the
+// class instance
+class Buffer : public std::enable_shared_from_this<Buffer> {
+ public:
+  Buffer(const uint8_t* data, int64_t size) :
+      data_(data),
+      size_(size) {}
+
+  // An offset into data that is owned by another buffer, but we want to be
+  // able to retain a valid pointer to it even after other shared_ptr's to the
+  // parent buffer have been destroyed
+  Buffer(const std::shared_ptr<Buffer>& parent, int64_t offset, int64_t size);
+
+  std::shared_ptr<Buffer> get_shared_ptr() {
+    return shared_from_this();
+  }
+
+  // Return true if both buffers hold at least nbytes bytes and their first
+  // nbytes bytes are equal
+  bool Equals(const Buffer& other, int64_t nbytes) const {
+    return this == &other ||
+      (size_ >= nbytes && other.size_ >= nbytes &&
+          !memcmp(data_, other.data_, nbytes));
+  }
+
+  bool Equals(const Buffer& other) const {
+    return this == &other ||
+      (size_ == other.size_ && !memcmp(data_, other.data_, size_));
+  }
+
+  const uint8_t* data() const {
+    return data_;
+  }
+
+  int64_t size() const {
+    return size_;
+  }
+
+  // Returns true if this Buffer is referencing memory (possibly) owned by some
+  // other buffer
+  bool is_shared() const {
+    return static_cast<bool>(parent_);
+  }
+
+  const std::shared_ptr<Buffer> parent() const {
+    return parent_;
+  }
+
+ protected:
+  const uint8_t* data_;
+  int64_t size_;
+
+  // nullptr by default, but may be set
+  std::shared_ptr<Buffer> parent_;
+
+ private:
+  DISALLOW_COPY_AND_ASSIGN(Buffer);
+};
+
+// A Buffer whose contents can be mutated. May or may not own its data.
+class MutableBuffer : public Buffer {
+ public:
+  MutableBuffer(uint8_t* data, int64_t size) :
+      Buffer(data, size) {
+    mutable_data_ = data;
+  }
+
+  uint8_t* mutable_data() {
+    return mutable_data_;
+  }
+
+  // Get a read-only view of this buffer
+  std::shared_ptr<Buffer> GetImmutableView();
+
+ protected:
+  MutableBuffer() :
+      Buffer(nullptr, 0),
+      mutable_data_(nullptr) {}
+
+  uint8_t* mutable_data_;
+};
+
+// A MutableBuffer whose memory is owned by the class instance. For example,
+// for reading data out of files that you want to deallocate when this
+// instance is destroyed
+class OwnedMutableBuffer : public MutableBuffer {
+ public:
+  OwnedMutableBuffer();
+  Status Resize(int64_t new_size);
+
+ private:
+  // TODO: aligned allocations
+  std::vector<uint8_t> buffer_owner_;
+};
+
+} // namespace arrow
+
+#endif // ARROW_UTIL_BUFFER_H
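
The parent-offset constructor above relies on a child holding a shared_ptr to its parent so that the bytes stay valid after every other owner has gone. A minimal standalone sketch of that ownership pattern follows; ByteView is a hypothetical class written for this illustration and is not the Buffer API from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for Buffer: either owns its storage or is a slice
// of a parent that it keeps alive.
class ByteView {
 public:
  // Owning view: takes the storage and points at it.
  explicit ByteView(std::vector<uint8_t> storage)
      : storage_(std::move(storage)),
        data_(storage_.data()),
        size_(static_cast<int64_t>(storage_.size())) {}

  // Slice view: points into the parent's bytes and retains the parent.
  ByteView(const std::shared_ptr<ByteView>& parent, int64_t offset,
           int64_t size)
      : data_(parent->data() + offset), size_(size), parent_(parent) {
    assert(offset >= 0 && size >= 0 && offset + size <= parent->size());
  }

  ByteView(const ByteView&) = delete;
  ByteView& operator=(const ByteView&) = delete;

  const uint8_t* data() const { return data_; }
  int64_t size() const { return size_; }

  // True when this view references memory held by another view.
  bool is_shared() const { return static_cast<bool>(parent_); }

 private:
  std::vector<uint8_t> storage_;      // empty for slice views
  const uint8_t* data_;
  int64_t size_;
  std::shared_ptr<ByteView> parent_;  // nullptr for owning views
};
```

With a parent over {1, 2, 3, 4, 5} and a slice created as std::make_shared<ByteView>(parent, 1, 3), resetting the caller's parent pointer leaves the slice readable, because the slice still holds a reference; the storage is released only once the slice is destroyed as well.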

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/macros.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/macros.h b/cpp/src/arrow/util/macros.h
new file mode 100644
index 0000000..069e627
--- /dev/null
+++ b/cpp/src/arrow/util/macros.h
@@ -0,0 +1,26 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef ARROW_UTIL_MACROS_H
+#define ARROW_UTIL_MACROS_H
+
+// From Google gutil
+#define DISALLOW_COPY_AND_ASSIGN(TypeName)      \
+  TypeName(const TypeName&) = delete;           \
+  void operator=(const TypeName&) = delete
+
+#endif // ARROW_UTIL_MACROS_H

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/random.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/random.h b/cpp/src/arrow/util/random.h
new file mode 100644
index 0000000..64c197e
--- /dev/null
+++ b/cpp/src/arrow/util/random.h
@@ -0,0 +1,128 @@
+// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file. See the AUTHORS file for names of contributors.
+
+// Moved from Kudu http://github.com/cloudera/kudu
+
+#ifndef ARROW_UTIL_RANDOM_H_
+#define ARROW_UTIL_RANDOM_H_
+
+#include <stdint.h>
+
+#include <cmath>
+
+namespace arrow {
+
+namespace random_internal {
+
+static const uint32_t M = 2147483647L;   // 2^31-1
+const double kTwoPi = 6.283185307179586476925286;
+
+} // namespace random_internal
+
+// A very simple random number generator.  Not especially good at
+// generating truly random bits, but good enough for our needs in this
+// package. This implementation is not thread-safe.
+class Random {
+ public:
+  explicit Random(uint32_t s) : seed_(s & 0x7fffffffu) {
+    // Avoid bad seeds.
+    if (seed_ == 0 || seed_ == random_internal::M) {
+      seed_ = 1;
+    }
+  }
+
+  // Next pseudo-random 32-bit unsigned integer.
+  // FIXME: This currently only generates 31 bits of randomness.
+  // The MSB will always be zero.
+  uint32_t Next() {
+    static const uint64_t A = 16807;  // bits 14, 8, 7, 5, 2, 1, 0
+    // We are computing
+    //       seed_ = (seed_ * A) % M,    where M = 2^31-1
+    //
+    // seed_ must not be zero or M, or else all subsequent computed values
+    // will be zero or M respectively.  For all other values, seed_ will end
+    // up cycling through every number in [1,M-1]
+    uint64_t product = seed_ * A;
+
+    // Compute (product % M) using the fact that ((x << 31) % M) == x.
+    seed_ = static_cast<uint32_t>((product >> 31) + (product & random_internal::M));
+    // The first reduction may overflow by 1 bit, so we may need to
+    // repeat.  mod == M is not possible; using > allows the faster
+    // sign-bit-based test.
+    if (seed_ > random_internal::M) {
+      seed_ -= random_internal::M;
+    }
+    return seed_;
+  }
+
+  // Alias for consistency with Next64
+  uint32_t Next32() { return Next(); }
+
+  // Next pseudo-random 64-bit unsigned integer.
+  // FIXME: This currently only generates 62 bits of randomness due to Next()
+  // only giving 31 bits of randomness. The 2 most significant bits will always
+  // be zero.
+  uint64_t Next64() {
+    uint64_t large = Next();
+    // Only shift by 31 bits so we end up with zeros in MSB and not scattered
+    // throughout the 64-bit word. This is due to the weakness in Next() noted
+    // above.
+    large <<= 31;
+    large |= Next();
+    return large;
+  }
+
+  // Returns a uniformly distributed value in the range [0..n-1]
+  // REQUIRES: n > 0
+  uint32_t Uniform(uint32_t n) { return Next() % n; }
+
+  // Alias for consistency with Uniform64
+  uint32_t Uniform32(uint32_t n) { return Uniform(n); }
+
+  // Returns a uniformly distributed 64-bit value in the range [0..n-1]
+  // REQUIRES: n > 0
+  uint64_t Uniform64(uint64_t n) { return Next64() % n; }
+
+  // Randomly returns true ~"1/n" of the time, and false otherwise.
+  // REQUIRES: n > 0
+  bool OneIn(int n) { return (Next() % n) == 0; }
+
+  // Skewed: pick "base" uniformly from range [0,max_log] and then
+  // return "base" random bits.  The effect is to pick a number in the
+  // range [0,2^max_log-1] with exponential bias towards smaller numbers.
+  uint32_t Skewed(int max_log) {
+    return Uniform(1 << Uniform(max_log + 1));
+  }
+
+  // Creates a normal distribution variable using the
+  // Box-Muller transform. See:
+  // http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform
+  // Adapted from WebRTC source code at:
+  // webrtc/trunk/modules/video_coding/main/test/test_util.cc
+  double Normal(double mean, double std_dev) {
+    double uniform1 = (Next() + 1.0) / (random_internal::M + 1.0);
+    double uniform2 = (Next() + 1.0) / (random_internal::M + 1.0);
+    return (mean + std_dev * sqrt(-2 * ::log(uniform1)) *
+        cos(random_internal::kTwoPi * uniform2));
+  }
+
+  // Return a random number strictly between 0.0 and 1.0 (Next() is in [1,M-1]).
+  double NextDoubleFraction() {
+    return Next() / static_cast<double>(random_internal::M + 1.0);
+  }
+
+ private:
+  uint32_t seed_;
+};
+
+
+inline uint32_t random_seed() {
+  // TODO: use system time to get a reasonably random seed
+  return 0;
+}
+
+
+} // namespace arrow
+
+#endif  // ARROW_UTIL_RANDOM_H_
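
The FIXME on Next64() in the header above says that joining two 31-bit draws with a 31-bit shift leaves the two most significant bits of the result at zero. A minimal Java sketch of that arithmetic follows. The class and method names (Next64Sketch, combine31) are hypothetical and are not part of the Arrow sources; the inputs are assumed to be any values that fit in 31 bits.

```java
// Hypothetical illustration of the bit arithmetic only; not Arrow code.
final class Next64Sketch {
  // Largest value that fits in 31 bits (2^31 - 1).
  static final long MAX_31_BIT = (1L << 31) - 1;

  // Join two 31-bit values the way Next64() does: shift the first left by
  // 31 bits and OR in the second. Only the low 62 bits can ever be set.
  static long combine31(long high, long low) {
    return (high << 31) | low;
  }

  public static void main(String[] args) {
    long largest = combine31(MAX_31_BIT, MAX_31_BIT);
    // Even with both inputs at their maximum, the top two bits stay zero.
    System.out.println(Long.numberOfLeadingZeros(largest)); // prints 2
    System.out.println(largest == (1L << 62) - 1);          // prints true
  }
}
```

Shifting by 32 instead would leave one zero bit in the middle of the word and one at the top, which is the scattering the original comment wants to avoid.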

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/status.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/status.cc b/cpp/src/arrow/util/status.cc
new file mode 100644
index 0000000..c64b8a3
--- /dev/null
+++ b/cpp/src/arrow/util/status.cc
@@ -0,0 +1,38 @@
+// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file. See the AUTHORS file for names of contributors.
+//
+// A Status encapsulates the result of an operation.  It may indicate success,
+// or it may indicate an error with an associated error message.
+//
+// Multiple threads can invoke const methods on a Status without
+// external synchronization, but if any of the threads may call a
+// non-const method, all threads accessing the same Status must use
+// external synchronization.
+
+#include "arrow/util/status.h"
+
+#include <assert.h>
+
+namespace arrow {
+
+Status::Status(StatusCode code, const std::string& msg, int16_t posix_code) {
+  assert(code != StatusCode::OK);
+  const uint32_t size = msg.size();
+  char* result = new char[size + 7];
+  memcpy(result, &size, sizeof(size));
+  result[4] = static_cast<char>(code);
+  memcpy(result + 5, &posix_code, sizeof(posix_code));
+  memcpy(result + 7, msg.c_str(), msg.size());
+  state_ = result;
+}
+
+const char* Status::CopyState(const char* state) {
+  uint32_t size;
+  memcpy(&size, state, sizeof(size));
+  char* result = new char[size + 7];
+  memcpy(result, state, size + 7);
+  return result;
+}
+
+} // namespace arrow


[09/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/HistoricalLog.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/HistoricalLog.java b/java/memory/src/main/java/org/apache/arrow/memory/util/HistoricalLog.java
new file mode 100644
index 0000000..38cb779
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/HistoricalLog.java
@@ -0,0 +1,185 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+import java.util.Arrays;
+import java.util.LinkedList;
+
+import org.slf4j.Logger;
+
+/**
+ * Utility class that can be used to log activity within a class
+ * for later logging and debugging. Supports recording events and
+ * recording the stack at the time they occur.
+ */
+public class HistoricalLog {
+  private static class Event {
+    private final String note; // the event text
+    private final StackTrace stackTrace; // where the event occurred
+    private final long time;
+
+    public Event(final String note) {
+      this.note = note;
+      this.time = System.nanoTime();
+      stackTrace = new StackTrace();
+    }
+  }
+
+  private final LinkedList<Event> history = new LinkedList<>();
+  private final String idString; // the formatted id string
+  private Event firstEvent; // the first stack trace recorded
+  private final int limit; // the limit on the number of events kept
+
+  /**
+   * Constructor. The format string will be formatted and have its arguments
+   * substituted at the time this is called.
+   *
+   * @param idStringFormat {@link String#format} format string that can be used
+   *     to identify this object in a log. Including some kind of unique identifier
+   *     that can be associated with the object instance is best.
+   * @param args for the format string, or nothing if none are required
+   */
+  public HistoricalLog(final String idStringFormat, Object... args) {
+    this(Integer.MAX_VALUE, idStringFormat, args);
+  }
+
+  /**
+   * Constructor. The format string will be formatted and have its arguments
+   * substituted at the time this is called.
+   *
+   * <p>This form supports the specification of a limit that will limit the
+   * number of historical entries kept (which keeps down the amount of memory
+   * used). With the limit, the first entry made is always kept (under the
+   * assumption that this is the creation site of the object, which is usually
+   * interesting), and then up to the limit number of entries are kept after that.
+   * Each time a new entry is made, the oldest that is not the first is dropped.
+   * </p>
+   *
+   * @param limit the maximum number of historical entries that will be kept,
+   *   not including the first entry made
+   * @param idStringFormat {@link String#format} format string that can be used
+   *     to identify this object in a log. Including some kind of unique identifier
+   *     that can be associated with the object instance is best.
+   * @param args for the format string, or nothing if none are required
+   */
+  public HistoricalLog(final int limit, final String idStringFormat, Object... args) {
+    this.limit = limit;
+    this.idString = String.format(idStringFormat, args);
+  }
+
+  /**
+   * Record an event. Automatically captures the stack trace at the time this is
+   * called. The format string will be formatted and have its arguments substituted
+   * at the time this is called.
+   *
+   * @param noteFormat {@link String#format} format string that describes the event
+   * @param args for the format string, or nothing if none are required
+   */
+  public synchronized void recordEvent(final String noteFormat, Object... args) {
+    final String note = String.format(noteFormat, args);
+    final Event event = new Event(note);
+    if (firstEvent == null) {
+      firstEvent = event;
+    }
+    if (history.size() == limit) {
+      history.removeFirst();
+    }
+    history.add(event);
+  }
+
+  /**
+   * Write the history of this object to the given {@link StringBuilder}: the identifying
+   * string provided at construction time, followed by all the recorded events.
+   *
+   * @param sb {@link StringBuilder} to write to
+   * @param includeStackTrace whether to write the stack trace captured with each event
+   */
+  public void buildHistory(final StringBuilder sb, boolean includeStackTrace) {
+    buildHistory(sb, 0, includeStackTrace);
+  }
+
+  /**
+   * Write the history of this object to the given {@link StringBuilder}. The history
+   * includes the identifying string provided at construction time, and all the recorded
+   * events, optionally with their stack traces.
+   *
+   * <p>The first recorded event is always written first, followed by the later
+   * events that are still retained.
+   * </p>
+   *
+   * @param sb {@link StringBuilder} to write to
+   * @param indent the number of spaces by which to indent the identifying line;
+   *     the first event is indented two spaces further and later events four
+   *     spaces further
+   * @param includeStackTrace whether to write the stack trace captured with each
+   *     event
+   */
+  public synchronized void buildHistory(final StringBuilder sb, int indent, boolean includeStackTrace) {
+    final char[] indentation = new char[indent];
+    final char[] innerIndentation = new char[indent + 2];
+    Arrays.fill(indentation, ' ');
+    Arrays.fill(innerIndentation, ' ');
+
+    sb.append(indentation)
+        .append("event log for: ")
+        .append(idString)
+        .append('\n');
+
+
+    if (firstEvent != null) {
+      sb.append(innerIndentation)
+          .append(firstEvent.time)
+          .append(' ')
+          .append(firstEvent.note)
+          .append('\n');
+      if (includeStackTrace) {
+        firstEvent.stackTrace.writeToBuilder(sb, indent + 2);
+      }
+
+      for(final Event event : history) {
+        if (event == firstEvent) {
+          continue;
+        }
+        sb.append(innerIndentation)
+            .append("  ")
+            .append(event.time)
+            .append(' ')
+            .append(event.note)
+            .append('\n');
+
+        if (includeStackTrace) {
+          event.stackTrace.writeToBuilder(sb, indent + 2);
+          sb.append('\n');
+        }
+      }
+    }
+  }
+
+  /**
+   * Write the history of this object to the given {@link Logger}. The history
+   * includes the identifying string provided at construction time, and all the recorded
+   * events with their stack traces.
+   *
+   * @param logger {@link Logger} to write to
+   */
+  public void logHistory(final Logger logger) {
+    final StringBuilder sb = new StringBuilder();
+    buildHistory(sb, 0, true);
+    logger.debug(sb.toString());
+  }
+}
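
The retention rule in recordEvent() above (always keep the first event, then keep a sliding window of the most recent entries) can be sketched in isolation. This is a simplified stand-in under stated assumptions: BoundedHistorySketch, record and snapshot are hypothetical names, plain strings replace the Event objects, and stack traces, timestamps and synchronization are left out. Like the original, it assumes a limit of at least 1.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical, simplified model of HistoricalLog's retention logic.
final class BoundedHistorySketch {
  private final LinkedList<String> history = new LinkedList<>();
  private final int limit; // assumed to be >= 1
  private String first;    // always retained, like firstEvent

  BoundedHistorySketch(int limit) {
    this.limit = limit;
  }

  void record(String note) {
    if (first == null) {
      first = note;
    }
    // Once the window is full, drop its oldest entry before appending.
    if (history.size() == limit) {
      history.removeFirst();
    }
    history.add(note);
  }

  // The first note, then the retained window; the first note is skipped
  // inside the window (identity check, as in buildHistory()).
  List<String> snapshot() {
    List<String> out = new ArrayList<>();
    if (first != null) {
      out.add(first);
      for (String note : history) {
        if (note != first) {
          out.add(note);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    BoundedHistorySketch log = new BoundedHistorySketch(2);
    for (String note : new String[] {"create", "retain", "slice", "release"}) {
      log.record(note);
    }
    System.out.println(log.snapshot()); // prints [create, slice, release]
  }
}
```

With a limit of 2, "retain" falls out of the window while "create" survives because it is held separately from the list.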

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/Metrics.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/Metrics.java b/java/memory/src/main/java/org/apache/arrow/memory/util/Metrics.java
new file mode 100644
index 0000000..5177a24
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/Metrics.java
@@ -0,0 +1,40 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+import com.codahale.metrics.MetricRegistry;
+
+public class Metrics {
+
+  private Metrics() {
+
+  }
+
+  private static class RegistryHolder {
+    public static final MetricRegistry REGISTRY;
+
+    static {
+      REGISTRY = new MetricRegistry();
+    }
+
+  }
+
+  public static MetricRegistry getInstance() {
+    return RegistryHolder.REGISTRY;
+  }
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/Pointer.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/Pointer.java b/java/memory/src/main/java/org/apache/arrow/memory/util/Pointer.java
new file mode 100644
index 0000000..58ab13b
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/Pointer.java
@@ -0,0 +1,28 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+public class Pointer<T> {
+  public T value;
+
+  public Pointer(){}
+
+  public Pointer(T value){
+    this.value = value;
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/java/org/apache/arrow/memory/util/StackTrace.java
----------------------------------------------------------------------
diff --git a/java/memory/src/main/java/org/apache/arrow/memory/util/StackTrace.java b/java/memory/src/main/java/org/apache/arrow/memory/util/StackTrace.java
new file mode 100644
index 0000000..638c2fb
--- /dev/null
+++ b/java/memory/src/main/java/org/apache/arrow/memory/util/StackTrace.java
@@ -0,0 +1,70 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory.util;
+
+import java.util.Arrays;
+
+/**
+ * Convenient way of obtaining and manipulating stack traces for debugging.
+ */
+public class StackTrace {
+  private final StackTraceElement[] stackTraceElements;
+
+  /**
+   * Constructor. Captures the current stack trace.
+   */
+  public StackTrace() {
+    // skip over the first element; the exclusive end index also drops the last (outermost) frame
+    final StackTraceElement[] stack = Thread.currentThread().getStackTrace();
+    stackTraceElements = Arrays.copyOfRange(stack, 1, stack.length - 1);
+  }
+
+  /**
+   * Write the stack trace to a StringBuilder.
+   * @param sb
+   *          where to write it
+   * @param indent
+   *          how many double spaces to indent each line
+   */
+  public void writeToBuilder(final StringBuilder sb, final int indent) {
+    // create the indentation string
+    final char[] indentation = new char[indent * 2];
+    Arrays.fill(indentation, ' ');
+
+    // write the stack trace in standard Java format
+    for(StackTraceElement ste : stackTraceElements) {
+      sb.append(indentation)
+          .append("at ")
+          .append(ste.getClassName())
+          .append('.')
+          .append(ste.getMethodName())
+          .append('(')
+          .append(ste.getFileName())
+          .append(':')
+          .append(Integer.toString(ste.getLineNumber()))
+          .append(")\n");
+    }
+  }
+
+  @Override
+  public String toString() {
+    final StringBuilder sb = new StringBuilder();
+    writeToBuilder(sb, 0);
+    return sb.toString();
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/main/resources/drill-module.conf
----------------------------------------------------------------------
diff --git a/java/memory/src/main/resources/drill-module.conf b/java/memory/src/main/resources/drill-module.conf
new file mode 100644
index 0000000..593ef8e
--- /dev/null
+++ b/java/memory/src/main/resources/drill-module.conf
@@ -0,0 +1,25 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+//  This file tells Drill to consider this module when class path scanning.
+//  This file can also include any supplementary configuration information.
+//  This file is in HOCON format, see https://github.com/typesafehub/config/blob/master/HOCON.md for more information.
+drill: {
+  memory: {
+    debug.error_on_leak: true,
+    top.max: 1000000000000
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/test/java/org/apache/arrow/memory/TestAccountant.java
----------------------------------------------------------------------
diff --git a/java/memory/src/test/java/org/apache/arrow/memory/TestAccountant.java b/java/memory/src/test/java/org/apache/arrow/memory/TestAccountant.java
new file mode 100644
index 0000000..86bccf5
--- /dev/null
+++ b/java/memory/src/test/java/org/apache/arrow/memory/TestAccountant.java
@@ -0,0 +1,164 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import static org.junit.Assert.assertEquals;
+
+import org.apache.arrow.memory.Accountant;
+import org.apache.arrow.memory.Accountant.AllocationOutcome;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class TestAccountant {
+
+  @Test
+  public void basic() {
+    ensureAccurateReservations(null);
+  }
+
+  @Test
+  public void nested() {
+    final Accountant parent = new Accountant(null, 0, Long.MAX_VALUE);
+    ensureAccurateReservations(parent);
+    assertEquals(0, parent.getAllocatedMemory());
+  }
+
+  @Test
+  public void multiThread() throws InterruptedException {
+    final Accountant parent = new Accountant(null, 0, Long.MAX_VALUE);
+
+    final int numberOfThreads = 32;
+    final int loops = 100;
+    Thread[] threads = new Thread[numberOfThreads];
+
+    for (int i = 0; i < numberOfThreads; i++) {
+      Thread t = new Thread() {
+
+        @Override
+        public void run() {
+          try {
+            for (int i = 0; i < loops; i++) {
+              ensureAccurateReservations(parent);
+            }
+          } catch (Exception ex) {
+            ex.printStackTrace();
+            Assert.fail(ex.getMessage());
+          }
+        }
+
+      };
+      threads[i] = t;
+      t.start();
+    }
+
+    for (Thread thread : threads) {
+      thread.join();
+    }
+
+    assertEquals(0, parent.getAllocatedMemory());
+  }
+
+  private void ensureAccurateReservations(Accountant outsideParent) {
+    final Accountant parent = new Accountant(outsideParent, 0, 10);
+    assertEquals(0, parent.getAllocatedMemory());
+
+    final Accountant child = new Accountant(parent, 2, Long.MAX_VALUE);
+    assertEquals(2, parent.getAllocatedMemory());
+
+    {
+      AllocationOutcome first = child.allocateBytes(1);
+      assertEquals(AllocationOutcome.SUCCESS, first);
+    }
+
+    // child will have new allocation
+    assertEquals(1, child.getAllocatedMemory());
+
+    // root has no change since within reservation
+    assertEquals(2, parent.getAllocatedMemory());
+
+    {
+      AllocationOutcome first = child.allocateBytes(1);
+      assertEquals(AllocationOutcome.SUCCESS, first);
+    }
+
+    // child will have new allocation
+    assertEquals(2, child.getAllocatedMemory());
+
+    // root has no change since within reservation
+    assertEquals(2, parent.getAllocatedMemory());
+
+    child.releaseBytes(1);
+
+    // child allocation shrinks after the release
+    assertEquals(1, child.getAllocatedMemory());
+
+    // root has no change since within reservation
+    assertEquals(2, parent.getAllocatedMemory());
+
+    {
+      AllocationOutcome first = child.allocateBytes(2);
+      assertEquals(AllocationOutcome.SUCCESS, first);
+    }
+
+    // child will have new allocation
+    assertEquals(3, child.getAllocatedMemory());
+
+    // went beyond reservation, now in parent accountant
+    assertEquals(3, parent.getAllocatedMemory());
+
+    {
+      AllocationOutcome first = child.allocateBytes(7);
+      assertEquals(AllocationOutcome.SUCCESS, first);
+    }
+
+    // child will have new allocation
+    assertEquals(10, child.getAllocatedMemory());
+
+    // went beyond reservation, now in parent accountant
+    assertEquals(10, parent.getAllocatedMemory());
+
+    child.releaseBytes(9);
+
+    assertEquals(1, child.getAllocatedMemory());
+
+    // back to reservation size
+    assertEquals(2, parent.getAllocatedMemory());
+
+    AllocationOutcome first = child.allocateBytes(10);
+    assertEquals(AllocationOutcome.FAILED_PARENT, first);
+
+    // unchanged
+    assertEquals(1, child.getAllocatedMemory());
+    assertEquals(2, parent.getAllocatedMemory());
+
+    boolean withinLimit = child.forceAllocate(10);
+    assertEquals(false, withinLimit);
+
+    // at new limit
+    assertEquals(11, child.getAllocatedMemory());
+    assertEquals(11, parent.getAllocatedMemory());
+
+
+    child.releaseBytes(11);
+    assertEquals(0, child.getAllocatedMemory());
+    assertEquals(2, parent.getAllocatedMemory());
+
+    child.close();
+    parent.close();
+  }
+}
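
The assertions in ensureAccurateReservations() above follow one rule: a child charges its parent for its reservation up front, and after that only for whatever exceeds the reservation. The sketch below models just that rule. It is a hypothetical simplification, not Arrow's Accountant: ReservationSketch and its methods are invented for illustration, it is single-threaded, it returns a boolean instead of an AllocationOutcome, and it has no forceAllocate.

```java
// Hypothetical single-threaded model of reservation-based accounting.
final class ReservationSketch {
  private final ReservationSketch parent; // null for a root
  private final long reservation;
  private final long limit;
  private long allocated;

  ReservationSketch(ReservationSketch parent, long reservation, long limit) {
    this.parent = parent;
    this.reservation = reservation;
    this.limit = limit;
    // The reservation is charged to the parent at construction time.
    if (parent != null && reservation > 0 && !parent.allocate(reservation)) {
      throw new IllegalStateException("reservation does not fit in parent");
    }
  }

  // The amount this accountant currently holds against its parent.
  private long chargedToParent(long localBytes) {
    return Math.max(localBytes, reservation);
  }

  boolean allocate(long size) {
    if (allocated + size > limit) {
      return false;
    }
    long extra = chargedToParent(allocated + size) - chargedToParent(allocated);
    if (parent != null && extra > 0 && !parent.allocate(extra)) {
      return false; // the parent refused; nothing has changed locally
    }
    allocated += size;
    return true;
  }

  void release(long size) {
    long before = chargedToParent(allocated);
    allocated -= size;
    long after = chargedToParent(allocated);
    if (parent != null && before > after) {
      parent.release(before - after);
    }
  }

  long allocated() {
    return allocated;
  }

  public static void main(String[] args) {
    ReservationSketch parent = new ReservationSketch(null, 0, 10);
    ReservationSketch child = new ReservationSketch(parent, 2, Long.MAX_VALUE);
    child.allocate(1); // within the reservation, parent stays at 2
    System.out.println(child.allocated() + " " + parent.allocated()); // prints 1 2
    child.allocate(2); // one byte past the reservation, parent grows to 3
    System.out.println(child.allocated() + " " + parent.allocated()); // prints 3 3
    System.out.println(child.allocate(10)); // prints false: parent limit is 10
  }
}
```

The failed allocation leaves both counters untouched, which matches the "unchanged" assertions that follow the FAILED_PARENT check in the test.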

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/test/java/org/apache/arrow/memory/TestBaseAllocator.java
----------------------------------------------------------------------
diff --git a/java/memory/src/test/java/org/apache/arrow/memory/TestBaseAllocator.java b/java/memory/src/test/java/org/apache/arrow/memory/TestBaseAllocator.java
new file mode 100644
index 0000000..e13dabb
--- /dev/null
+++ b/java/memory/src/test/java/org/apache/arrow/memory/TestBaseAllocator.java
@@ -0,0 +1,648 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+import io.netty.buffer.ArrowBuf;
+import io.netty.buffer.ArrowBuf.TransferResult;
+
+import org.apache.arrow.memory.AllocationReservation;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.memory.RootAllocator;
+import org.junit.Ignore;
+import org.junit.Test;
+
+public class TestBaseAllocator {
+  // private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(TestBaseAllocator.class);
+
+  private final static int MAX_ALLOCATION = 8 * 1024;
+
+/*
+  // ---------------------------------------- DEBUG -----------------------------------
+
+  @After
+  public void checkBuffers() {
+    final int bufferCount = UnsafeDirectLittleEndian.getBufferCount();
+    if (bufferCount != 0) {
+      UnsafeDirectLittleEndian.logBuffers(logger);
+      UnsafeDirectLittleEndian.releaseBuffers();
+    }
+
+    assertEquals(0, bufferCount);
+  }
+
+//  @AfterClass
+//  public static void dumpBuffers() {
+//    UnsafeDirectLittleEndian.logBuffers(logger);
+//  }
+
+  // ---------------------------------------- DEBUG ------------------------------------
+*/
+
+
+  @Test
+  public void test_privateMax() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      final ArrowBuf drillBuf1 = rootAllocator.buffer(MAX_ALLOCATION / 2);
+      assertNotNull("allocation failed", drillBuf1);
+
+      try(final BufferAllocator childAllocator =
+          rootAllocator.newChildAllocator("noLimits", 0, MAX_ALLOCATION)) {
+        final ArrowBuf drillBuf2 = childAllocator.buffer(MAX_ALLOCATION / 2);
+        assertNotNull("allocation failed", drillBuf2);
+        drillBuf2.release();
+      }
+
+      drillBuf1.release();
+    }
+  }
+
+  @Test(expected=IllegalStateException.class)
+  public void testRootAllocator_closeWithOutstanding() throws Exception {
+    try {
+      try(final RootAllocator rootAllocator =
+          new RootAllocator(MAX_ALLOCATION)) {
+        final ArrowBuf drillBuf = rootAllocator.buffer(512);
+        assertNotNull("allocation failed", drillBuf);
+      }
+    } finally {
+      /*
+       * We expect there to be one unreleased underlying buffer because we're closing
+       * without releasing it.
+       */
+/*
+      // ------------------------------- DEBUG ---------------------------------
+      final int bufferCount = UnsafeDirectLittleEndian.getBufferCount();
+      UnsafeDirectLittleEndian.releaseBuffers();
+      assertEquals(1, bufferCount);
+      // ------------------------------- DEBUG ---------------------------------
+*/
+    }
+  }
+
+  @Test
+  public void testRootAllocator_getEmpty() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      final ArrowBuf drillBuf = rootAllocator.buffer(0);
+      assertNotNull("allocation failed", drillBuf);
+      assertEquals("capacity was non-zero", 0, drillBuf.capacity());
+      drillBuf.release();
+    }
+  }
+
+  @Ignore // TODO(DRILL-2740)
+  @Test(expected = IllegalStateException.class)
+  public void testAllocator_unreleasedEmpty() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      @SuppressWarnings("unused")
+      final ArrowBuf drillBuf = rootAllocator.buffer(0);
+    }
+  }
+
+  @Test
+  public void testAllocator_transferOwnership() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      final BufferAllocator childAllocator1 =
+          rootAllocator.newChildAllocator("changeOwnership1", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator2 =
+          rootAllocator.newChildAllocator("changeOwnership2", 0, MAX_ALLOCATION);
+
+      final ArrowBuf drillBuf1 = childAllocator1.buffer(MAX_ALLOCATION / 4);
+      rootAllocator.verify();
+      TransferResult transferOwnership = drillBuf1.transferOwnership(childAllocator2);
+      final boolean allocationFit = transferOwnership.allocationFit;
+      rootAllocator.verify();
+      assertTrue(allocationFit);
+
+      drillBuf1.release();
+      childAllocator1.close();
+      rootAllocator.verify();
+
+      transferOwnership.buffer.release();
+      childAllocator2.close();
+    }
+  }
+
+  @Test
+  public void testAllocator_shareOwnership() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      final BufferAllocator childAllocator1 = rootAllocator.newChildAllocator("shareOwnership1", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator2 = rootAllocator.newChildAllocator("shareOwnership2", 0, MAX_ALLOCATION);
+      final ArrowBuf drillBuf1 = childAllocator1.buffer(MAX_ALLOCATION / 4);
+      rootAllocator.verify();
+
+      // share ownership of buffer.
+      final ArrowBuf drillBuf2 = drillBuf1.retain(childAllocator2);
+      rootAllocator.verify();
+      assertNotNull(drillBuf2);
+      assertNotEquals(drillBuf2, drillBuf1);
+
+      // release original buffer, thus transferring ownership to allocator 2 (should leave allocator 1 in empty state)
+      drillBuf1.release();
+      rootAllocator.verify();
+      childAllocator1.close();
+      rootAllocator.verify();
+
+      final BufferAllocator childAllocator3 = rootAllocator.newChildAllocator("shareOwnership3", 0, MAX_ALLOCATION);
+      final ArrowBuf drillBuf3 = drillBuf1.retain(childAllocator3);
+      assertNotNull(drillBuf3);
+      assertNotEquals(drillBuf3, drillBuf1);
+      assertNotEquals(drillBuf3, drillBuf2);
+      rootAllocator.verify();
+
+      drillBuf2.release();
+      rootAllocator.verify();
+      childAllocator2.close();
+      rootAllocator.verify();
+
+      drillBuf3.release();
+      rootAllocator.verify();
+      childAllocator3.close();
+    }
+  }
+
+  @Test
+  public void testRootAllocator_createChildAndUse() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      try (final BufferAllocator childAllocator = rootAllocator.newChildAllocator("createChildAndUse", 0,
+          MAX_ALLOCATION)) {
+        final ArrowBuf drillBuf = childAllocator.buffer(512);
+        assertNotNull("allocation failed", drillBuf);
+        drillBuf.release();
+      }
+    }
+  }
+
+  @Test(expected=IllegalStateException.class)
+  public void testRootAllocator_createChildDontClose() throws Exception {
+    try {
+      try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+        final BufferAllocator childAllocator = rootAllocator.newChildAllocator("createChildDontClose", 0,
+            MAX_ALLOCATION);
+        final ArrowBuf drillBuf = childAllocator.buffer(512);
+        assertNotNull("allocation failed", drillBuf);
+      }
+    } finally {
+      /*
+       * We expect one underlying buffer because we closed a child allocator without
+       * releasing the buffer allocated from it.
+       */
+/*
+      // ------------------------------- DEBUG ---------------------------------
+      final int bufferCount = UnsafeDirectLittleEndian.getBufferCount();
+      UnsafeDirectLittleEndian.releaseBuffers();
+      assertEquals(1, bufferCount);
+      // ------------------------------- DEBUG ---------------------------------
+*/
+    }
+  }
+
+  private static void allocateAndFree(final BufferAllocator allocator) {
+    final ArrowBuf drillBuf = allocator.buffer(512);
+    assertNotNull("allocation failed", drillBuf);
+    drillBuf.release();
+
+    final ArrowBuf drillBuf2 = allocator.buffer(MAX_ALLOCATION);
+    assertNotNull("allocation failed", drillBuf2);
+    drillBuf2.release();
+
+    final int nBufs = 8;
+    final ArrowBuf[] drillBufs = new ArrowBuf[nBufs];
+    for(int i = 0; i < drillBufs.length; ++i) {
+      ArrowBuf drillBufi = allocator.buffer(MAX_ALLOCATION / nBufs);
+      assertNotNull("allocation failed", drillBufi);
+      drillBufs[i] = drillBufi;
+    }
+    for(ArrowBuf drillBufi : drillBufs) {
+      drillBufi.release();
+    }
+  }
+
+  @Test
+  public void testAllocator_manyAllocations() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      try(final BufferAllocator childAllocator =
+          rootAllocator.newChildAllocator("manyAllocations", 0, MAX_ALLOCATION)) {
+        allocateAndFree(childAllocator);
+      }
+    }
+  }
+
+  @Test
+  public void testAllocator_overAllocate() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      try(final BufferAllocator childAllocator =
+          rootAllocator.newChildAllocator("overAllocate", 0, MAX_ALLOCATION)) {
+        allocateAndFree(childAllocator);
+
+        try {
+          childAllocator.buffer(MAX_ALLOCATION + 1);
+          fail("allocated memory beyond max allowed");
+        } catch (OutOfMemoryException e) {
+          // expected
+        }
+      }
+    }
+  }
+
+  @Test
+  public void testAllocator_overAllocateParent() throws Exception {
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      try(final BufferAllocator childAllocator =
+          rootAllocator.newChildAllocator("overAllocateParent", 0, MAX_ALLOCATION)) {
+        final ArrowBuf drillBuf1 = rootAllocator.buffer(MAX_ALLOCATION / 2);
+        assertNotNull("allocation failed", drillBuf1);
+        final ArrowBuf drillBuf2 = childAllocator.buffer(MAX_ALLOCATION / 2);
+        assertNotNull("allocation failed", drillBuf2);
+
+        try {
+          childAllocator.buffer(MAX_ALLOCATION / 4);
+          fail("allocated memory beyond max allowed");
+        } catch (OutOfMemoryException e) {
+          // expected
+        }
+
+        drillBuf1.release();
+        drillBuf2.release();
+      }
+    }
+  }
+
+  private static void testAllocator_sliceUpBufferAndRelease(
+      final RootAllocator rootAllocator, final BufferAllocator bufferAllocator) {
+    final ArrowBuf drillBuf1 = bufferAllocator.buffer(MAX_ALLOCATION / 2);
+    rootAllocator.verify();
+
+    final ArrowBuf drillBuf2 = drillBuf1.slice(16, drillBuf1.capacity() - 32);
+    rootAllocator.verify();
+    final ArrowBuf drillBuf3 = drillBuf2.slice(16, drillBuf2.capacity() - 32);
+    rootAllocator.verify();
+    @SuppressWarnings("unused")
+    final ArrowBuf drillBuf4 = drillBuf3.slice(16, drillBuf3.capacity() - 32);
+    rootAllocator.verify();
+
+    drillBuf3.release(); // since they share refcounts, one is enough to release them all
+    rootAllocator.verify();
+  }
+
+  @Test
+  public void testAllocator_createSlices() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      testAllocator_sliceUpBufferAndRelease(rootAllocator, rootAllocator);
+
+      try (final BufferAllocator childAllocator = rootAllocator.newChildAllocator("createSlices", 0, MAX_ALLOCATION)) {
+        testAllocator_sliceUpBufferAndRelease(rootAllocator, childAllocator);
+      }
+      rootAllocator.verify();
+
+      testAllocator_sliceUpBufferAndRelease(rootAllocator, rootAllocator);
+
+      try (final BufferAllocator childAllocator = rootAllocator.newChildAllocator("createSlices", 0, MAX_ALLOCATION)) {
+        try (final BufferAllocator childAllocator2 =
+            childAllocator.newChildAllocator("createSlices", 0, MAX_ALLOCATION)) {
+          final ArrowBuf drillBuf1 = childAllocator2.buffer(MAX_ALLOCATION / 8);
+          @SuppressWarnings("unused")
+          final ArrowBuf drillBuf2 = drillBuf1.slice(MAX_ALLOCATION / 16, MAX_ALLOCATION / 16);
+          testAllocator_sliceUpBufferAndRelease(rootAllocator, childAllocator);
+          drillBuf1.release();
+          rootAllocator.verify();
+        }
+        rootAllocator.verify();
+
+        testAllocator_sliceUpBufferAndRelease(rootAllocator, childAllocator);
+      }
+      rootAllocator.verify();
+    }
+  }
+
+  @Test
+  public void testAllocator_sliceRanges() throws Exception {
+//    final AllocatorOwner allocatorOwner = new NamedOwner("sliceRanges");
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      // Populate a buffer with byte values corresponding to their indices.
+      final ArrowBuf drillBuf = rootAllocator.buffer(256);
+      assertEquals(256, drillBuf.capacity());
+      assertEquals(0, drillBuf.readerIndex());
+      assertEquals(0, drillBuf.readableBytes());
+      assertEquals(0, drillBuf.writerIndex());
+      assertEquals(256, drillBuf.writableBytes());
+
+      final ArrowBuf slice3 = (ArrowBuf) drillBuf.slice();
+      assertEquals(0, slice3.readerIndex());
+      assertEquals(0, slice3.readableBytes());
+      assertEquals(0, slice3.writerIndex());
+//      assertEquals(256, slice3.capacity());
+//      assertEquals(256, slice3.writableBytes());
+
+      for(int i = 0; i < 256; ++i) {
+        drillBuf.writeByte(i);
+      }
+      assertEquals(0, drillBuf.readerIndex());
+      assertEquals(256, drillBuf.readableBytes());
+      assertEquals(256, drillBuf.writerIndex());
+      assertEquals(0, drillBuf.writableBytes());
+
+      final ArrowBuf slice1 = (ArrowBuf) drillBuf.slice();
+      assertEquals(0, slice1.readerIndex());
+      assertEquals(256, slice1.readableBytes());
+      for(int i = 0; i < 10; ++i) {
+        assertEquals(i, slice1.readByte());
+      }
+      assertEquals(256 - 10, slice1.readableBytes());
+      for(int i = 0; i < 256; ++i) {
+        assertEquals((byte) i, slice1.getByte(i));
+      }
+
+      final ArrowBuf slice2 = (ArrowBuf) drillBuf.slice(25, 25);
+      assertEquals(0, slice2.readerIndex());
+      assertEquals(25, slice2.readableBytes());
+      for(int i = 25; i < 50; ++i) {
+        assertEquals(i, slice2.readByte());
+      }
+
+/*
+      for(int i = 256; i > 0; --i) {
+        slice3.writeByte(i - 1);
+      }
+      for(int i = 0; i < 256; ++i) {
+        assertEquals(255 - i, slice1.getByte(i));
+      }
+*/
+
+      drillBuf.release(); // all the derived buffers share this fate
+    }
+  }
+
+  @Test
+  public void testAllocator_slicesOfSlices() throws Exception {
+//    final AllocatorOwner allocatorOwner = new NamedOwner("slicesOfSlices");
+    try(final RootAllocator rootAllocator =
+        new RootAllocator(MAX_ALLOCATION)) {
+      // Populate a buffer with byte values corresponding to their indices.
+      final ArrowBuf drillBuf = rootAllocator.buffer(256);
+      for(int i = 0; i < 256; ++i) {
+        drillBuf.writeByte(i);
+      }
+
+      // Slice it up.
+      final ArrowBuf slice0 = drillBuf.slice(0, drillBuf.capacity());
+      for(int i = 0; i < 256; ++i) {
+        assertEquals((byte) i, drillBuf.getByte(i));
+      }
+
+      final ArrowBuf slice10 = slice0.slice(10, drillBuf.capacity() - 10);
+      for(int i = 10; i < 256; ++i) {
+        assertEquals((byte) i, slice10.getByte(i - 10));
+      }
+
+      final ArrowBuf slice20 = slice10.slice(10, drillBuf.capacity() - 20);
+      for(int i = 20; i < 256; ++i) {
+        assertEquals((byte) i, slice20.getByte(i - 20));
+      }
+
+      final ArrowBuf slice30 = slice20.slice(10,  drillBuf.capacity() - 30);
+      for(int i = 30; i < 256; ++i) {
+        assertEquals((byte) i, slice30.getByte(i - 30));
+      }
+
+      drillBuf.release();
+    }
+  }
+
+  @Test
+  public void testAllocator_transferSliced() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      final BufferAllocator childAllocator1 = rootAllocator.newChildAllocator("transferSliced1", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator2 = rootAllocator.newChildAllocator("transferSliced2", 0, MAX_ALLOCATION);
+
+      final ArrowBuf drillBuf1 = childAllocator1.buffer(MAX_ALLOCATION / 8);
+      final ArrowBuf drillBuf2 = childAllocator2.buffer(MAX_ALLOCATION / 8);
+
+      final ArrowBuf drillBuf1s = drillBuf1.slice(0, drillBuf1.capacity() / 2);
+      final ArrowBuf drillBuf2s = drillBuf2.slice(0, drillBuf2.capacity() / 2);
+
+      rootAllocator.verify();
+
+      TransferResult result1 = drillBuf2s.transferOwnership(childAllocator1);
+      rootAllocator.verify();
+      TransferResult result2 = drillBuf1s.transferOwnership(childAllocator2);
+      rootAllocator.verify();
+
+      result1.buffer.release();
+      result2.buffer.release();
+
+      drillBuf1s.release(); // releases drillBuf1
+      drillBuf2s.release(); // releases drillBuf2
+
+      childAllocator1.close();
+      childAllocator2.close();
+    }
+  }
+
+  @Test
+  public void testAllocator_shareSliced() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      final BufferAllocator childAllocator1 = rootAllocator.newChildAllocator("shareSliced1", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator2 = rootAllocator.newChildAllocator("shareSliced2", 0, MAX_ALLOCATION);
+
+      final ArrowBuf drillBuf1 = childAllocator1.buffer(MAX_ALLOCATION / 8);
+      final ArrowBuf drillBuf2 = childAllocator2.buffer(MAX_ALLOCATION / 8);
+
+      final ArrowBuf drillBuf1s = drillBuf1.slice(0, drillBuf1.capacity() / 2);
+      final ArrowBuf drillBuf2s = drillBuf2.slice(0, drillBuf2.capacity() / 2);
+
+      rootAllocator.verify();
+
+      final ArrowBuf drillBuf2s1 = drillBuf2s.retain(childAllocator1);
+      final ArrowBuf drillBuf1s2 = drillBuf1s.retain(childAllocator2);
+      rootAllocator.verify();
+
+      drillBuf1s.release(); // releases drillBuf1
+      drillBuf2s.release(); // releases drillBuf2
+      rootAllocator.verify();
+
+      drillBuf2s1.release(); // releases the shared drillBuf2 slice
+      drillBuf1s2.release(); // releases the shared drillBuf1 slice
+
+      childAllocator1.close();
+      childAllocator2.close();
+    }
+  }
+
+  @Test
+  public void testAllocator_transferShared() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      final BufferAllocator childAllocator1 = rootAllocator.newChildAllocator("transferShared1", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator2 = rootAllocator.newChildAllocator("transferShared2", 0, MAX_ALLOCATION);
+      final BufferAllocator childAllocator3 = rootAllocator.newChildAllocator("transferShared3", 0, MAX_ALLOCATION);
+
+      final ArrowBuf drillBuf1 = childAllocator1.buffer(MAX_ALLOCATION / 8);
+
+      boolean allocationFit;
+
+      ArrowBuf drillBuf2 = drillBuf1.retain(childAllocator2);
+      rootAllocator.verify();
+      assertNotNull(drillBuf2);
+      assertNotEquals(drillBuf2, drillBuf1);
+
+      TransferResult result = drillBuf1.transferOwnership(childAllocator3);
+      allocationFit = result.allocationFit;
+      final ArrowBuf drillBuf3 = result.buffer;
+      assertTrue(allocationFit);
+      rootAllocator.verify();
+
+      // childAllocator3 now owns childAllocator1's buffer, so childAllocator1 can be closed.
+      drillBuf1.release();
+      childAllocator1.close();
+      rootAllocator.verify();
+
+      drillBuf2.release();
+      childAllocator2.close();
+      rootAllocator.verify();
+
+      final BufferAllocator childAllocator4 = rootAllocator.newChildAllocator("transferShared4", 0, MAX_ALLOCATION);
+      TransferResult result2 = drillBuf3.transferOwnership(childAllocator4);
+      allocationFit = result2.allocationFit;
+      final ArrowBuf drillBuf4 = result2.buffer;
+      assertTrue(allocationFit);
+      rootAllocator.verify();
+
+      drillBuf3.release();
+      childAllocator3.close();
+      rootAllocator.verify();
+
+      drillBuf4.release();
+      childAllocator4.close();
+      rootAllocator.verify();
+    }
+  }
+
+  @Test
+  public void testAllocator_unclaimedReservation() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+      try (final BufferAllocator childAllocator1 =
+          rootAllocator.newChildAllocator("unclaimedReservation", 0, MAX_ALLOCATION)) {
+        try(final AllocationReservation reservation = childAllocator1.newReservation()) {
+          assertTrue(reservation.add(64));
+        }
+        rootAllocator.verify();
+      }
+    }
+  }
+
+  @Test
+  public void testAllocator_claimedReservation() throws Exception {
+    try (final RootAllocator rootAllocator = new RootAllocator(MAX_ALLOCATION)) {
+
+      try (final BufferAllocator childAllocator1 = rootAllocator.newChildAllocator("claimedReservation", 0,
+          MAX_ALLOCATION)) {
+
+        try (final AllocationReservation reservation = childAllocator1.newReservation()) {
+          assertTrue(reservation.add(32));
+          assertTrue(reservation.add(32));
+
+          final ArrowBuf drillBuf = reservation.allocateBuffer();
+          assertEquals(64, drillBuf.capacity());
+          rootAllocator.verify();
+
+          drillBuf.release();
+          rootAllocator.verify();
+        }
+        rootAllocator.verify();
+      }
+    }
+  }
+
+  @Test
+  public void multiple() throws Exception {
+    final String owner = "test";
+    try (RootAllocator allocator = new RootAllocator(Long.MAX_VALUE)) {
+
+      final int op = 100000;
+
+      BufferAllocator frag1 = allocator.newChildAllocator(owner, 1500000, Long.MAX_VALUE);
+      BufferAllocator frag2 = allocator.newChildAllocator(owner, 500000, Long.MAX_VALUE);
+
+      allocator.verify();
+
+      BufferAllocator allocator11 = frag1.newChildAllocator(owner, op, Long.MAX_VALUE);
+      ArrowBuf b11 = allocator11.buffer(1000000);
+
+      allocator.verify();
+
+      BufferAllocator allocator12 = frag1.newChildAllocator(owner, op, Long.MAX_VALUE);
+      ArrowBuf b12 = allocator12.buffer(500000);
+
+      allocator.verify();
+
+      BufferAllocator allocator21 = frag1.newChildAllocator(owner, op, Long.MAX_VALUE);
+
+      allocator.verify();
+
+      BufferAllocator allocator22 = frag2.newChildAllocator(owner, op, Long.MAX_VALUE);
+      ArrowBuf b22 = allocator22.buffer(2000000);
+
+      allocator.verify();
+
+      BufferAllocator frag3 = allocator.newChildAllocator(owner, 1000000, Long.MAX_VALUE);
+
+      allocator.verify();
+
+      BufferAllocator allocator31 = frag3.newChildAllocator(owner, op, Long.MAX_VALUE);
+      ArrowBuf b31a = allocator31.buffer(200000);
+
+      allocator.verify();
+
+      // Previously running operator completes
+      b22.release();
+
+      allocator.verify();
+
+      allocator22.close();
+
+      b31a.release();
+      allocator31.close();
+
+      b12.release();
+      allocator12.close();
+
+      allocator21.close();
+
+      b11.release();
+      allocator11.close();
+
+      frag1.close();
+      frag2.close();
+      frag3.close();
+
+    }
+  }
+}
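The overAllocate and overAllocateParent tests above rely on an allocation being charged to the child allocator and to every ancestor, so a request can fail at the parent even when the child's own limit has room. The following is a minimal, self-contained sketch of that accounting rule only. `LimitTracker` and `LimitTrackerDemo` are hypothetical names introduced for illustration; they are not part of the Arrow API, and the real allocator also handles reservations, reference counts and thread safety.

```java
/** Hypothetical sketch of hierarchical limit accounting; not the Arrow BufferAllocator. */
final class LimitTracker {
  private final LimitTracker parent; // null for the root tracker
  private final long limit;          // maximum bytes this tracker may hold
  private long allocated;            // bytes currently charged to this tracker

  LimitTracker(LimitTracker parent, long limit) {
    this.parent = parent;
    this.limit = limit;
  }

  /** Charges size bytes here and to every ancestor, or charges nothing at all. */
  boolean allocate(long size) {
    if (size < 0 || size > limit - allocated) {
      return false; // would exceed this tracker's own limit
    }
    if (parent != null && !parent.allocate(size)) {
      return false; // an ancestor is out of room; nothing was charged here
    }
    allocated += size;
    return true;
  }

  /** Returns size bytes to this tracker and to every ancestor. */
  void release(long size) {
    allocated -= size;
    if (parent != null) {
      parent.release(size);
    }
  }

  long allocated() {
    return allocated;
  }
}

final class LimitTrackerDemo {
  public static void main(String[] args) {
    final long max = 1024; // stands in for MAX_ALLOCATION; the value is an assumption
    LimitTracker root = new LimitTracker(null, max);
    LimitTracker child = new LimitTracker(root, max);

    System.out.println(root.allocate(max / 2));  // root takes half of the budget
    System.out.println(child.allocate(max / 2)); // child takes the other half
    System.out.println(child.allocate(max / 4)); // fails: the root limit is exhausted
  }
}
```

The child is charged only after the parent accepts the request, so a failed allocation leaves both counters unchanged.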

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/memory/src/test/java/org/apache/arrow/memory/TestEndianess.java
----------------------------------------------------------------------
diff --git a/java/memory/src/test/java/org/apache/arrow/memory/TestEndianess.java b/java/memory/src/test/java/org/apache/arrow/memory/TestEndianess.java
new file mode 100644
index 0000000..25357dc
--- /dev/null
+++ b/java/memory/src/test/java/org/apache/arrow/memory/TestEndianess.java
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.memory;
+
+import static org.junit.Assert.assertEquals;
+import io.netty.buffer.ByteBuf;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.RootAllocator;
+import org.junit.Test;
+
+
+public class TestEndianess {
+
+  @Test
+  public void testLittleEndian() {
+    final BufferAllocator a = new RootAllocator(10000);
+    final ByteBuf b = a.buffer(4);
+    b.setInt(0, 35);
+    assertEquals(35, b.getByte(0)); // least-significant byte is stored first
+    assertEquals(0, b.getByte(1));
+    assertEquals(0, b.getByte(2));
+    assertEquals(0, b.getByte(3));
+    b.release();
+    a.close();
+  }
+
+}
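The byte layout that testLittleEndian checks can be reproduced with the JDK alone. The sketch below uses java.nio.ByteBuffer rather than an Arrow buffer, so it illustrates little-endian ordering in general and says nothing about how ArrowBuf is implemented. `LittleEndianSketch` and `intToLittleEndian` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical helper showing little-endian int encoding with the JDK only. */
final class LittleEndianSketch {

  /** Encodes value as 4 bytes, least-significant byte first. */
  static byte[] intToLittleEndian(int value) {
    return ByteBuffer.allocate(4)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putInt(value)
        .array();
  }

  public static void main(String[] args) {
    // 35 fits in one byte, so only the first byte is non-zero.
    System.out.println(Arrays.toString(intToLittleEndian(35))); // prints [35, 0, 0, 0]
  }
}
```

A freshly allocated ByteBuffer is big-endian by default, which is why the byte order is set explicitly before the write.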

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/pom.xml
----------------------------------------------------------------------
diff --git a/java/pom.xml b/java/pom.xml
new file mode 100644
index 0000000..8a3b192
--- /dev/null
+++ b/java/pom.xml
@@ -0,0 +1,470 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
+  license agreements. See the NOTICE file distributed with this work for additional
+  information regarding copyright ownership. The ASF licenses this file to
+  You under the Apache License, Version 2.0 (the "License"); you may not use
+  this file except in compliance with the License. You may obtain a copy of
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
+  by applicable law or agreed to in writing, software distributed under the
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
+  OF ANY KIND, either express or implied. See the License for the specific
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache</groupId>
+    <artifactId>apache</artifactId>
+    <version>14</version>
+  </parent>
+
+  <groupId>org.apache.arrow</groupId>
+  <artifactId>arrow-java-root</artifactId>
+  <version>0.1-SNAPSHOT</version>
+  <packaging>pom</packaging>
+
+  <name>Apache Arrow Java Root POM</name>
+  <description>Apache Arrow is an open source specification and set of libraries for an in-memory columnar data layout.</description>
+  <url>http://arrow.apache.org/</url>
+
+  <properties>
+    <target.gen.source.path>${project.basedir}/target/generated-sources</target.gen.source.path>
+    <dep.junit.version>4.11</dep.junit.version>
+    <dep.slf4j.version>1.7.6</dep.slf4j.version>
+    <dep.guava.version>18.0</dep.guava.version>
+    <forkCount>2</forkCount>
+    <jackson.version>2.7.1</jackson.version>
+    <hadoop.version>2.7.1</hadoop.version>
+    <fmpp.version>0.9.15</fmpp.version>
+    <freemarker.version>2.3.21</freemarker.version>
+  </properties>
+
+  <scm>
+    <connection>scm:git:https://git-wip-us.apache.org/repos/asf/arrow.git</connection>
+    <developerConnection>scm:git:https://git-wip-us.apache.org/repos/asf/arrow.git</developerConnection>
+    <url>https://github.com/apache/arrow</url>
+    <tag>HEAD</tag>
+  </scm>
+
+  <mailingLists>
+    <mailingList>
+      <name>Developer List</name>
+      <subscribe>dev-subscribe@arrow.apache.org</subscribe>
+      <unsubscribe>dev-unsubscribe@arrow.apache.org</unsubscribe>
+      <post>dev@arrow.apache.org</post>
+      <archive>http://mail-archives.apache.org/mod_mbox/arrow-dev/</archive>
+    </mailingList>
+    <mailingList>
+      <name>Commits List</name>
+      <subscribe>commits-subscribe@arrow.apache.org</subscribe>
+      <unsubscribe>commits-unsubscribe@arrow.apache.org</unsubscribe>
+      <post>commits@arrow.apache.org</post>
+      <archive>http://mail-archives.apache.org/mod_mbox/arrow-commits/</archive>
+    </mailingList>
+    <mailingList>
+      <name>Issues List</name>
+      <subscribe>issues-subscribe@arrow.apache.org</subscribe>
+      <unsubscribe>issues-unsubscribe@arrow.apache.org</unsubscribe>
+      <archive>http://mail-archives.apache.org/mod_mbox/arrow-issues/</archive>
+    </mailingList>
+  </mailingLists>
+
+  <repositories>
+
+  </repositories>
+
+  <issueManagement>
+    <system>Jira</system>
+    <url>https://issues.apache.org/jira/browse/arrow</url>
+  </issueManagement>
+
+  <build>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>rat-checks</id>
+            <phase>validate</phase>
+            <goals>
+              <goal>check</goal>
+            </goals>
+          </execution>
+        </executions>
+        <configuration>
+          <excludeSubprojects>false</excludeSubprojects>
+          <excludes>
+            <exclude>**/*.log</exclude>
+            <exclude>**/*.css</exclude>
+            <exclude>**/*.js</exclude>
+            <exclude>**/*.md</exclude>
+            <exclude>**/*.eps</exclude>
+            <exclude>**/*.json</exclude>
+            <exclude>**/*.seq</exclude>
+            <exclude>**/*.parquet</exclude>
+            <exclude>**/*.sql</exclude>
+            <exclude>**/git.properties</exclude>
+            <exclude>**/*.csv</exclude>
+            <exclude>**/*.csvh</exclude>
+            <exclude>**/*.csvh-test</exclude>
+            <exclude>**/*.tsv</exclude>
+            <exclude>**/*.txt</exclude>
+            <exclude>**/*.ssv</exclude>
+            <exclude>**/arrow-*.conf</exclude>
+            <exclude>**/.buildpath</exclude>
+            <exclude>**/*.proto</exclude>
+            <exclude>**/*.fmpp</exclude>
+            <exclude>**/target/**</exclude>
+            <exclude>**/*.iml</exclude>
+            <exclude>**/*.tdd</exclude>
+            <exclude>**/*.project</exclude>
+            <exclude>**/TAGS</exclude>
+            <exclude>**/*.checkstyle</exclude>
+            <exclude>**/.classpath</exclude>
+            <exclude>**/.settings/**</exclude>
+            <exclude>.*/**</exclude>
+            <exclude>**/*.patch</exclude>
+            <exclude>**/*.pb.cc</exclude>
+            <exclude>**/*.pb.h</exclude>
+            <exclude>**/*.linux</exclude>
+            <exclude>**/client/build/**</exclude>
+            <exclude>**/*.tbl</exclude>
+          </excludes>
+        </configuration>
+      </plugin>
+
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <configuration>
+          <excludes>
+            <exclude>**/logging.properties</exclude>
+            <exclude>**/logback-test.xml</exclude>
+            <exclude>**/logback.out.xml</exclude>
+            <exclude>**/logback.xml</exclude>
+          </excludes>
+          <archive>
+            <index>true</index>
+            <manifest>
+              <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
+              <addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
+            </manifest>
+            <manifestEntries>
+              <Extension-Name>org.apache.arrow</Extension-Name>
+              <Built-By>${username}</Built-By>
+              <url>http://arrow.apache.org/</url>
+            </manifestEntries>
+          </archive>
+        </configuration>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+            <configuration>
+              <skipIfEmpty>true</skipIfEmpty>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
+
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-resources-plugin</artifactId>
+        <configuration>
+          <encoding>UTF-8</encoding>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <configuration>
+          <source>1.7</source>
+          <target>1.7</target>
+          <maxmem>2048m</maxmem>
+          <useIncrementalCompilation>false</useIncrementalCompilation>
+          <fork>true</fork>
+        </configuration>
+      </plugin>
+      <plugin>
+        <artifactId>maven-enforcer-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>validate_java_and_maven_version</id>
+            <phase>verify</phase>
+            <goals>
+              <goal>enforce</goal>
+            </goals>
+            <inherited>false</inherited>
+            <configuration>
+              <rules>
+                <requireMavenVersion>
+                  <version>[3.0.4,4)</version>
+                </requireMavenVersion>
+              </rules>
+            </configuration>
+          </execution>
+          <execution>
+            <id>avoid_bad_dependencies</id>
+            <phase>verify</phase>
+            <goals>
+              <goal>enforce</goal>
+            </goals>
+            <configuration>
+              <rules>
+                <bannedDependencies>
+                  <excludes>
+                    <exclude>commons-logging</exclude>
+                    <exclude>javax.servlet:servlet-api</exclude>
+                    <exclude>org.mortbay.jetty:servlet-api</exclude>
+                    <exclude>org.mortbay.jetty:servlet-api-2.5</exclude>
+                    <exclude>log4j:log4j</exclude>
+                  </excludes>
+                </bannedDependencies>
+              </rules>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>pl.project13.maven</groupId>
+        <artifactId>git-commit-id-plugin</artifactId>
+        <version>2.1.9</version>
+        <executions>
+          <execution>
+            <id>for-jars</id>
+            <inherited>true</inherited>
+            <goals>
+              <goal>revision</goal>
+            </goals>
+            <configuration>
+              <generateGitPropertiesFilename>target/classes/git.properties</generateGitPropertiesFilename>
+            </configuration>
+          </execution>
+          <execution>
+            <id>for-source-tarball</id>
+            <goals>
+              <goal>revision</goal>
+            </goals>
+            <inherited>false</inherited>
+            <configuration>
+              <generateGitPropertiesFilename>./git.properties</generateGitPropertiesFilename>
+            </configuration>
+          </execution>
+        </executions>
+
+        <configuration>
+          <dateFormat>dd.MM.yyyy '@' HH:mm:ss z</dateFormat>
+          <verbose>true</verbose>
+          <skipPoms>false</skipPoms>
+          <generateGitPropertiesFile>true</generateGitPropertiesFile>
+          <failOnNoGitDirectory>false</failOnNoGitDirectory>
+          <gitDescribe>
+            <skip>false</skip>
+            <always>false</always>
+            <abbrev>7</abbrev>
+            <dirty>-dirty</dirty>
+            <forceLongFormat>true</forceLongFormat>
+          </gitDescribe>
+        </configuration>
+      </plugin>
+    </plugins>
+    <pluginManagement>
+
+      <plugins>
+        <plugin>
+          <groupId>org.apache.rat</groupId>
+          <artifactId>apache-rat-plugin</artifactId>
+          <version>0.11</version>
+        </plugin>
+        <plugin>
+          <groupId>org.apache.maven.plugins</groupId>
+          <artifactId>maven-resources-plugin</artifactId>
+          <version>2.6</version>
+        </plugin>
+        <plugin>
+          <groupId>org.apache.maven.plugins</groupId>
+          <artifactId>maven-compiler-plugin</artifactId>
+          <version>3.2</version>
+        </plugin>
+        <plugin>
+          <artifactId>maven-enforcer-plugin</artifactId>
+          <version>1.3.1</version>
+        </plugin>
+        <plugin>
+          <artifactId>maven-surefire-plugin</artifactId>
+          <version>2.17</version>
+          <configuration>
+            <argLine>-ea</argLine>
+            <forkCount>${forkCount}</forkCount>
+            <reuseForks>true</reuseForks>
+            <systemPropertyVariables>
+              <java.io.tmpdir>${project.build.directory}</java.io.tmpdir>
+            </systemPropertyVariables>
+          </configuration>
+        </plugin>
+        <plugin>
+          <groupId>org.apache.maven.plugins</groupId>
+          <artifactId>maven-release-plugin</artifactId>
+          <version>2.5.2</version>
+          <configuration>
+            <useReleaseProfile>false</useReleaseProfile>
+            <pushChanges>false</pushChanges>
+            <goals>deploy</goals>
+            <arguments>-Papache-release ${arguments}</arguments>
+          </configuration>
+        </plugin>
+
+        <!--This plugin's configuration is used to store Eclipse m2e settings
+          only. It has no influence on the Maven build itself. -->
+        <plugin>
+          <groupId>org.eclipse.m2e</groupId>
+          <artifactId>lifecycle-mapping</artifactId>
+          <version>1.0.0</version>
+          <configuration>
+            <lifecycleMappingMetadata>
+              <pluginExecutions>
+                <pluginExecution>
+                  <pluginExecutionFilter>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>maven-antrun-plugin</artifactId>
+                    <versionRange>[1.6,)</versionRange>
+                    <goals>
+                      <goal>run</goal>
+                    </goals>
+                  </pluginExecutionFilter>
+                  <action>
+                    <ignore />
+                  </action>
+                </pluginExecution>
+                <pluginExecution>
+                  <pluginExecutionFilter>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>maven-enforcer-plugin</artifactId>
+                    <versionRange>[1.2,)</versionRange>
+                    <goals>
+                      <goal>enforce</goal>
+                    </goals>
+                  </pluginExecutionFilter>
+                  <action>
+                    <ignore />
+                  </action>
+                </pluginExecution>
+                <pluginExecution>
+                  <pluginExecutionFilter>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>
+                      maven-remote-resources-plugin
+                    </artifactId>
+                    <versionRange>[1.1,)</versionRange>
+                    <goals>
+                      <goal>process</goal>
+                    </goals>
+                  </pluginExecutionFilter>
+                  <action>
+                    <ignore />
+                  </action>
+                </pluginExecution>
+                <pluginExecution>
+                  <pluginExecutionFilter>
+                    <groupId>org.apache.rat</groupId>
+                    <artifactId>apache-rat-plugin</artifactId>
+                    <versionRange>[0.10,)</versionRange>
+                    <goals>
+                      <goal>check</goal>
+                    </goals>
+                  </pluginExecutionFilter>
+                  <action>
+                    <ignore />
+                  </action>
+                </pluginExecution>
+              </pluginExecutions>
+            </lifecycleMappingMetadata>
+          </configuration>
+        </plugin>
+      </plugins>
+    </pluginManagement>
+  </build>
+  <dependencies>
+
+    <dependency>
+      <groupId>io.netty</groupId>
+      <artifactId>netty-handler</artifactId>
+      <version>4.0.27.Final</version>
+    </dependency>
+
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+      <version>${dep.guava.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+      <version>${dep.slf4j.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>jul-to-slf4j</artifactId>
+      <version>${dep.slf4j.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>jcl-over-slf4j</artifactId>
+      <version>${dep.slf4j.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>log4j-over-slf4j</artifactId>
+      <version>${dep.slf4j.version}</version>
+    </dependency>
+
+    <!-- Test Dependencies -->
+    <dependency>
+      <!-- JMockit needs to be on class path before JUnit. -->
+      <groupId>com.googlecode.jmockit</groupId>
+      <artifactId>jmockit</artifactId>
+      <version>1.3</version>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <version>${dep.junit.version}</version>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <!-- Mockito needs to be on the class path after JUnit (or Hamcrest) as 
+           long as Mockito _contains_ older Hamcrest classes.  See arrow-2130. --> 
+      <groupId>org.mockito</groupId>
+      <artifactId>mockito-core</artifactId>
+      <version>1.9.5</version>
+    </dependency>
+    <dependency>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+      <version>1.0.13</version>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>de.huxhorn.lilith</groupId>
+      <artifactId>de.huxhorn.lilith.logback.appender.multiplex-classic</artifactId>
+      <version>0.9.44</version>
+      <scope>test</scope>
+    </dependency>
+
+  </dependencies>
+
+  <modules>
+    <module>memory</module>
+    <module>vector</module>
+  </modules>
+</project>

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/pom.xml
----------------------------------------------------------------------
diff --git a/java/vector/pom.xml b/java/vector/pom.xml
new file mode 100644
index 0000000..e693344
--- /dev/null
+++ b/java/vector/pom.xml
@@ -0,0 +1,165 @@
+<?xml version="1.0"?>
+<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor 
+  license agreements. See the NOTICE file distributed with this work for additional 
+  information regarding copyright ownership. The ASF licenses this file to 
+  You under the Apache License, Version 2.0 (the "License"); you may not use 
+  this file except in compliance with the License. You may obtain a copy of 
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required 
+  by applicable law or agreed to in writing, software distributed under the 
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS 
+  OF ANY KIND, either express or implied. See the License for the specific 
+  language governing permissions and limitations under the License. -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.arrow</groupId>
+    <artifactId>arrow-java-root</artifactId>
+    <version>0.1-SNAPSHOT</version>
+  </parent>
+  <artifactId>vector</artifactId>
+  <name>vectors</name>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.arrow</groupId>
+      <artifactId>arrow-memory</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>joda-time</groupId>
+      <artifactId>joda-time</artifactId>
+      <version>2.9</version>
+    </dependency>
+    <dependency>
+      <groupId>com.fasterxml.jackson.core</groupId>
+      <artifactId>jackson-annotations</artifactId>
+      <version>2.7.1</version>
+    </dependency>
+    <dependency>
+      <groupId>com.fasterxml.jackson.core</groupId>
+      <artifactId>jackson-databind</artifactId>
+      <version>2.7.1</version>
+    </dependency>
+    <dependency>
+      <groupId>com.carrotsearch</groupId>
+      <artifactId>hppc</artifactId>
+      <version>0.7.1</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-lang3</artifactId>
+      <version>3.4</version>
+    </dependency>
+
+
+  </dependencies>
+
+    <pluginRepositories>
+        <pluginRepository>
+            <id>apache</id>
+            <name>apache</name>
+            <url>https://repo.maven.apache.org/</url>
+            <releases>
+                <enabled>true</enabled>
+            </releases>
+            <snapshots>
+                <enabled>false</enabled>
+            </snapshots>
+        </pluginRepository>
+    </pluginRepositories>  
+    
+  <build>
+
+    <resources>
+      <resource>
+        <!-- Copy the FreeMarker templates and FMPP configuration files for the 
+          vectors so that clients can leverage the definitions. -->
+        <directory>${basedir}/src/main/codegen</directory>
+        <targetPath>codegen</targetPath>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <artifactId>maven-resources-plugin</artifactId>
+        <executions>
+          <execution> <!-- copy all templates in the same location to compile them at once -->
+            <id>copy-fmpp-resources</id>
+            <phase>initialize</phase>
+            <goals>
+              <goal>copy-resources</goal>
+            </goals>
+            <configuration>
+              <outputDirectory>${project.build.directory}/codegen</outputDirectory>
+              <resources>
+                <resource>
+                  <directory>src/main/codegen</directory>
+                  <filtering>false</filtering>
+                </resource>
+              </resources>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin> <!-- generate sources from fmpp -->
+        <groupId>org.apache.drill.tools</groupId>
+        <artifactId>drill-fmpp-maven-plugin</artifactId>
+        <version>1.4.0</version>
+        <executions>
+          <execution>
+            <id>generate-fmpp</id>
+            <phase>generate-sources</phase>
+            <goals>
+              <goal>generate</goal>
+            </goals>
+            <configuration>
+              <config>src/main/codegen/config.fmpp</config>
+              <output>${project.build.directory}/generated-sources</output>
+              <templates>${project.build.directory}/codegen/templates</templates>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+    </plugins>
+    <pluginManagement>
+      <plugins>
+        <!--This plugin's configuration is used to store Eclipse m2e settings 
+          only. It has no influence on the Maven build itself. -->
+        <plugin>
+          <groupId>org.eclipse.m2e</groupId>
+          <artifactId>lifecycle-mapping</artifactId>
+          <version>1.0.0</version>
+          <configuration>
+            <lifecycleMappingMetadata>
+              <pluginExecutions>
+                <pluginExecution>
+                  <pluginExecutionFilter>
+                    <groupId>org.apache.drill.tools</groupId>
+                    <artifactId>drill-fmpp-maven-plugin</artifactId>
+                    <versionRange>[1.0,)</versionRange>
+                    <goals>
+                      <goal>generate</goal>
+                    </goals>
+                  </pluginExecutionFilter>
+                  <action>
+                    <execute>
+                      <runOnIncremental>false</runOnIncremental>
+                      <runOnConfiguration>true</runOnConfiguration>
+                    </execute>
+                  </action>
+                </pluginExecution>
+              </pluginExecutions>
+            </lifecycleMappingMetadata>
+          </configuration>
+        </plugin>
+      </plugins>
+    </pluginManagement>
+    
+  
+  </build>
+
+
+
+</project>

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/config.fmpp
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/config.fmpp b/java/vector/src/main/codegen/config.fmpp
new file mode 100644
index 0000000..663677c
--- /dev/null
+++ b/java/vector/src/main/codegen/config.fmpp
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+data: {
+    # TODO:  Rename to ~valueVectorModesAndTypes for clarity.
+    vv:                       tdd(../data/ValueVectorTypes.tdd),
+
+}
+freemarkerLinks: {
+    includes: includes/
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/data/ValueVectorTypes.tdd
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/data/ValueVectorTypes.tdd b/java/vector/src/main/codegen/data/ValueVectorTypes.tdd
new file mode 100644
index 0000000..e747c30
--- /dev/null
+++ b/java/vector/src/main/codegen/data/ValueVectorTypes.tdd
@@ -0,0 +1,168 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+{
+  modes: [
+    {name: "Optional", prefix: "Nullable"},
+    {name: "Required", prefix: ""},
+    {name: "Repeated", prefix: "Repeated"}
+    ],
+  types: [
+    {
+      major: "Fixed",
+      width: 1,
+      javaType: "byte",
+      boxedType: "Byte",
+      fields: [{name: "value", type: "byte"}],
+      minor: [
+        { class: "TinyInt", valueHolder: "IntHolder" },
+        { class: "UInt1", valueHolder: "UInt1Holder" }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 2,
+      javaType: "char",
+      boxedType: "Character",
+      fields: [{name: "value", type: "char"}],
+      minor: [
+        { class: "UInt2", valueHolder: "UInt2Holder"}
+      ]
+    },    {
+      major: "Fixed",
+      width: 2,
+      javaType: "short",
+      boxedType: "Short",
+      fields: [{name: "value", type: "short"}],
+      minor: [
+        { class: "SmallInt", valueHolder: "Int2Holder"},
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 4,
+      javaType: "int",
+      boxedType: "Integer",
+      fields: [{name: "value", type: "int"}],
+      minor: [
+        { class: "Int", valueHolder: "IntHolder"},
+        { class: "UInt4", valueHolder: "UInt4Holder" },
+        { class: "Float4", javaType: "float" , boxedType: "Float", fields: [{name: "value", type: "float"}]},
+        { class: "Time", javaType: "int", friendlyType: "DateTime" },
+        { class: "IntervalYear", javaType: "int", friendlyType: "Period" },
+        { class: "Decimal9", maxPrecisionDigits: 9, friendlyType: "BigDecimal", fields: [{name:"value", type:"int"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] },
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 8,
+      javaType: "long",
+      boxedType: "Long",
+      fields: [{name: "value", type: "long"}],
+      minor: [
+        { class: "BigInt"},
+        { class: "UInt8" },
+        { class: "Float8", javaType: "double" , boxedType: "Double", fields: [{name: "value", type: "double"}], },
+        { class: "Date", javaType: "long", friendlyType: "DateTime" },
+        { class: "TimeStamp", javaType: "long", friendlyType: "DateTime" },
+        { class: "Decimal18", maxPrecisionDigits: 18, friendlyType: "BigDecimal", fields: [{name:"value", type:"long"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] },
+        <#--
+        { class: "Money", maxPrecisionDigits: 2, scale: 1, },
+        -->
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 12,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      minor: [
+        { class: "IntervalDay", millisecondsOffset: 4, friendlyType: "Period", fields: [ {name: "days", type:"int"}, {name: "milliseconds", type:"int"}] }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 16,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      minor: [
+        { class: "Interval", daysOffset: 4, millisecondsOffset: 8, friendlyType: "Period", fields: [ {name: "months", type: "int"}, {name: "days", type:"int"}, {name: "milliseconds", type:"int"}] }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 12,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      minor: [
+        <#--
+        { class: "TimeTZ" },
+        { class: "Interval" }
+        -->
+        { class: "Decimal28Dense", maxPrecisionDigits: 28, nDecimalDigits: 3, friendlyType: "BigDecimal", fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 16,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      
+      minor: [
+        { class: "Decimal38Dense", maxPrecisionDigits: 38, nDecimalDigits: 4, friendlyType: "BigDecimal", fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 24,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      minor: [
+        { class: "Decimal38Sparse", maxPrecisionDigits: 38, nDecimalDigits: 6, friendlyType: "BigDecimal", fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] }
+      ]
+    },
+    {
+      major: "Fixed",
+      width: 20,
+      javaType: "ArrowBuf",
+      boxedType: "ArrowBuf",
+      minor: [
+        { class: "Decimal28Sparse", maxPrecisionDigits: 28, nDecimalDigits: 5, friendlyType: "BigDecimal", fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}, {name: "scale", type: "int", include: false}, {name: "precision", type: "int", include: false}] }
+      ]
+    },
+    {
+      major: "VarLen",
+      width: 4,
+      javaType: "int",
+      boxedType: "ArrowBuf",
+      fields: [{name: "start", type: "int"}, {name: "end", type: "int"}, {name: "buffer", type: "ArrowBuf"}],
+      minor: [
+        { class: "VarBinary" , friendlyType: "byte[]" },
+        { class: "VarChar" , friendlyType: "Text" },
+        { class: "Var16Char" , friendlyType: "String" }
+      ]
+    },
+    {
+      major: "Bit",
+      width: 1,
+      javaType: "int",
+      boxedType: "Integer",
+      minor: [
+        { class: "Bit" , friendlyType: "Boolean", fields: [{name: "value", type: "int"}] }
+      ]
+    }
+  ]
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/includes/license.ftl
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/includes/license.ftl b/java/vector/src/main/codegen/includes/license.ftl
new file mode 100644
index 0000000..0455fd8
--- /dev/null
+++ b/java/vector/src/main/codegen/includes/license.ftl
@@ -0,0 +1,18 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/includes/vv_imports.ftl
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/includes/vv_imports.ftl b/java/vector/src/main/codegen/includes/vv_imports.ftl
new file mode 100644
index 0000000..2d808b1
--- /dev/null
+++ b/java/vector/src/main/codegen/includes/vv_imports.ftl
@@ -0,0 +1,62 @@
+<#-- Licensed to the Apache Software Foundation (ASF) under one or more contributor 
+  license agreements. See the NOTICE file distributed with this work for additional 
+  information regarding copyright ownership. The ASF licenses this file to 
+  You under the Apache License, Version 2.0 (the "License"); you may not use 
+  this file except in compliance with the License. You may obtain a copy of 
+  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required 
+  by applicable law or agreed to in writing, software distributed under the 
+  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS 
+  OF ANY KIND, either express or implied. See the License for the specific 
+  language governing permissions and limitations under the License. -->
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.ObjectArrays;
+import com.google.common.base.Charsets;
+import com.google.common.collect.ObjectArrays;
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.*;
+
+import org.apache.commons.lang3.ArrayUtils;
+
+import org.apache.arrow.memory.*;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.Types.*;
+import org.apache.arrow.vector.types.*;
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.holders.*;
+import org.apache.arrow.vector.util.*;
+import org.apache.arrow.vector.complex.*;
+import org.apache.arrow.vector.complex.reader.*;
+import org.apache.arrow.vector.complex.impl.*;
+import org.apache.arrow.vector.complex.writer.*;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.util.JsonStringArrayList;
+
+import java.util.Arrays;
+import java.util.Random;
+import java.util.List;
+
+import java.io.Closeable;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.nio.ByteBuffer;
+
+import java.sql.Date;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.joda.time.DateTime;
+import org.joda.time.Period;
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/AbstractFieldReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/AbstractFieldReader.java b/java/vector/src/main/codegen/templates/AbstractFieldReader.java
new file mode 100644
index 0000000..b83dba2
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/AbstractFieldReader.java
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/AbstractFieldReader.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+@SuppressWarnings("unused")
+abstract class AbstractFieldReader extends AbstractBaseReader implements FieldReader{
+  
+  AbstractFieldReader(){
+    super();
+  }
+
+  /**
+   * Returns true if the current value of the reader is not null.
+   * @return true if the current value is set (not null); this base implementation always returns true
+   */
+  public boolean isSet() {
+    return true;
+  }
+
+  <#list ["Object", "BigDecimal", "Integer", "Long", "Boolean", 
+          "Character", "DateTime", "Period", "Double", "Float",
+          "Text", "String", "Byte", "Short", "byte[]"] as friendlyType>
+  <#assign safeType=friendlyType />
+  <#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+  
+  public ${friendlyType} read${safeType}(int arrayIndex){
+    fail("read${safeType}(int arrayIndex)");
+    return null;
+  }
+  
+  public ${friendlyType} read${safeType}(){
+    fail("read${safeType}()");
+    return null;
+  }
+  
+  </#list>
+  
+  public void copyAsValue(MapWriter writer){
+    fail("CopyAsValue MapWriter");
+  }
+  public void copyAsField(String name, MapWriter writer){
+    fail("CopyAsField MapWriter");
+  }
+
+  public void copyAsField(String name, ListWriter writer){
+    fail("CopyAsFieldList");
+  }
+  
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign boxedType = (minor.boxedType!type.boxedType) />
+
+  public void read(${name}Holder holder){
+    fail("${name}");
+  }
+
+  public void read(Nullable${name}Holder holder){
+    fail("${name}");
+  }
+  
+  public void read(int arrayIndex, ${name}Holder holder){
+    fail("Repeated${name}");
+  }
+  
+  public void read(int arrayIndex, Nullable${name}Holder holder){
+    fail("Repeated${name}");
+  }
+  
+  public void copyAsValue(${name}Writer writer){
+    fail("CopyAsValue${name}");
+  }
+  public void copyAsField(String name, ${name}Writer writer){
+    fail("CopyAsField${name}");
+  }
+  </#list></#list>
+  
+  public FieldReader reader(String name){
+    fail("reader(String name)");
+    return null;
+  }
+
+  public FieldReader reader(){
+    fail("reader()");
+    return null;
+    
+  }
+  
+  public int size(){
+    fail("size()");
+    return -1;
+  }
+  
+  private void fail(String name){
+    throw new IllegalArgumentException(String.format("You tried to read a [%s] type when you are using a field reader of type [%s].", name, this.getClass().getSimpleName()));
+  }
+  
+  
+}
+
+
+


[13/17] arrow git commit: ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a cmake-based build system. The codebase generally follows Google C++ style guide, but more cleaning to be more conforming is

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/status.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/status.h b/cpp/src/arrow/util/status.h
new file mode 100644
index 0000000..47fda40
--- /dev/null
+++ b/cpp/src/arrow/util/status.h
@@ -0,0 +1,152 @@
+// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file. See the AUTHORS file for names of contributors.
+//
+// A Status encapsulates the result of an operation.  It may indicate success,
+// or it may indicate an error with an associated error message.
+//
+// Multiple threads can invoke const methods on a Status without
+// external synchronization, but if any of the threads may call a
+// non-const method, all threads accessing the same Status must use
+// external synchronization.
+
+// Adapted from Kudu github.com/cloudera/kudu
+
+#ifndef ARROW_STATUS_H_
+#define ARROW_STATUS_H_
+
+#include <cstdint>
+#include <cstring>
+#include <string>
+
+// Return the given status if it is not OK.
+#define ARROW_RETURN_NOT_OK(s) do {           \
+    ::arrow::Status _s = (s);                 \
+    if (!_s.ok()) return _s;                    \
+  } while (0)
+
+// Return the given status if it is not OK, but first clone it and
+// prepend the given message.
+#define ARROW_RETURN_NOT_OK_PREPEND(s, msg) do {                      \
+    ::arrow::Status _s = (s);                                         \
+    if (::gutil::PREDICT_FALSE(!_s.ok())) return _s.CloneAndPrepend(msg); \
+  } while (0)
+
+// Return 'to_return' if 'to_call' returns a bad status.
+// The substitution for 'to_return' may reference the variable
+// 's' for the bad status.
+#define ARROW_RETURN_NOT_OK_RET(to_call, to_return) do { \
+    ::arrow::Status s = (to_call); \
+    if (::gutil::PREDICT_FALSE(!s.ok())) return (to_return);    \
+  } while (0)
+
+// If 'to_call' returns a bad status, CHECK immediately with a logged message
+// of 'msg' followed by the status.
+#define ARROW_CHECK_OK_PREPEND(to_call, msg) do {             \
+    ::arrow::Status _s = (to_call);                           \
+    ARROW_CHECK(_s.ok()) << (msg) << ": " << _s.ToString();   \
+  } while (0)
+
+// If the status is bad, CHECK immediately, appending the status to the
+// logged message.
+#define ARROW_CHECK_OK(s) ARROW_CHECK_OK_PREPEND(s, "Bad status")
+
+namespace arrow {
+
+#define RETURN_NOT_OK(s) do {                   \
+    Status _s = (s);                            \
+    if (!_s.ok()) return _s;                    \
+  } while (0)
+
+enum class StatusCode: char {
+  OK = 0,
+  OutOfMemory = 1,
+  KeyError = 2,
+  Invalid = 3,
+
+  NotImplemented = 10,
+};
+
+class Status {
+ public:
+  // Create a success status.
+  Status() : state_(NULL) { }
+  ~Status() { delete[] state_; }
+
+  // Copy the specified status.
+  Status(const Status& s);
+  void operator=(const Status& s);
+
+  // Return a success status.
+  static Status OK() { return Status(); }
+
+  // Return error status of an appropriate type.
+  static Status OutOfMemory(const std::string& msg, int16_t posix_code = -1) {
+    return Status(StatusCode::OutOfMemory, msg, posix_code);
+  }
+
+  static Status KeyError(const std::string& msg) {
+    return Status(StatusCode::KeyError, msg, -1);
+  }
+
+  static Status NotImplemented(const std::string& msg) {
+    return Status(StatusCode::NotImplemented, msg, -1);
+  }
+
+  static Status Invalid(const std::string& msg) {
+    return Status(StatusCode::Invalid, msg, -1);
+  }
+
+  // Returns true iff the status indicates success.
+  bool ok() const { return (state_ == NULL); }
+
+  bool IsOutOfMemory() const { return code() == StatusCode::OutOfMemory; }
+  bool IsKeyError() const { return code() == StatusCode::KeyError; }
+  bool IsInvalid() const { return code() == StatusCode::Invalid; }
+
+  // Return a string representation of this status suitable for printing.
+  // Returns the string "OK" for success.
+  std::string ToString() const;
+
+  // Return a string representation of the status code, without the message
+  // text or posix code information.
+  std::string CodeAsString() const;
+
+  // Get the POSIX code associated with this Status, or -1 if there is none.
+  int16_t posix_code() const;
+
+ private:
+  // OK status has a NULL state_.  Otherwise, state_ is a new[] array
+  // of the following form:
+  //    state_[0..3] == length of message
+  //    state_[4]    == code
+  //    state_[5..6] == posix_code
+  //    state_[7..]  == message
+  const char* state_;
+
+  StatusCode code() const {
+    return ((state_ == NULL) ?
+        StatusCode::OK : static_cast<StatusCode>(state_[4]));
+  }
+
+  Status(StatusCode code, const std::string& msg, int16_t posix_code);
+  static const char* CopyState(const char* s);
+};
+
+inline Status::Status(const Status& s) {
+  state_ = (s.state_ == NULL) ? NULL : CopyState(s.state_);
+}
+
+inline void Status::operator=(const Status& s) {
+  // The following condition catches both aliasing (when this == &s),
+  // and the common case where both s and *this are ok.
+  if (state_ != s.state_) {
+    delete[] state_;
+    state_ = (s.state_ == NULL) ? NULL : CopyState(s.state_);
+  }
+}
+
+}  // namespace arrow
+
+
+#endif // ARROW_STATUS_H_
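
The private section of status.h documents the packed layout of state_ (length in bytes 0..3, code in byte 4, posix_code in bytes 5..6, message from byte 7), but the constructor and CopyState() that produce it are not part of this header. The following is only a minimal sketch of how that layout could be written and read back. PackState, StateLength, StatePosixCode, StateMessage and CopyPackedState are hypothetical helper names introduced for illustration, not functions from the patch, and the integers are stored in native byte order as an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not from the patch): packs an error into the layout
// documented in status.h:
//   state[0..3] == length of message
//   state[4]    == code
//   state[5..6] == posix_code
//   state[7..]  == message
inline const char* PackState(char code, const std::string& msg,
                             int16_t posix_code) {
  const uint32_t size = static_cast<uint32_t>(msg.size());
  char* result = new char[size + 7];
  std::memcpy(result, &size, sizeof(size));                  // bytes 0..3
  result[4] = code;                                          // byte 4
  std::memcpy(result + 5, &posix_code, sizeof(posix_code));  // bytes 5..6
  std::memcpy(result + 7, msg.data(), size);                 // bytes 7..
  return result;
}

// Reads the message length back out of bytes 0..3.
inline uint32_t StateLength(const char* state) {
  uint32_t size;
  std::memcpy(&size, state, sizeof(size));
  return size;
}

// Reads the POSIX code back out of bytes 5..6.
inline int16_t StatePosixCode(const char* state) {
  int16_t posix_code;
  std::memcpy(&posix_code, state + 5, sizeof(posix_code));
  return posix_code;
}

// Rebuilds the message from bytes 7.. using the stored length.
inline std::string StateMessage(const char* state) {
  return std::string(state + 7, StateLength(state));
}

// A deep copy only needs the stored length plus the 7-byte header, which is
// the kind of work a CopyState() implementation would have to do.
inline const char* CopyPackedState(const char* state) {
  const uint32_t total = StateLength(state) + 7;
  char* result = new char[total];
  std::memcpy(result, state, total);
  return result;
}
```

Because the length is stored in the buffer itself, a copy can be made without knowing anything about the original std::string. The buffers come from new[], so the caller releases them with delete[], matching the destructor in the header.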

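The RETURN_NOT_OK-style macros in status.h wrap their bodies in do { ... } while (0) so that a macro call behaves like a single statement. The sketch below illustrates why the wrapper is conventionally written without a trailing semicolon: the caller supplies the semicolon, and the macro can then sit in an unbraced if/else. MiniStatus, MINI_RETURN_NOT_OK, CheckPositive and Process are hypothetical stand-ins used only for this illustration, not names from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for arrow::Status, reduced to what the macro needs.
struct MiniStatus {
  bool ok_flag;
  std::string msg;

  bool ok() const { return ok_flag; }
  static MiniStatus OK() { return MiniStatus{true, ""}; }
  static MiniStatus Invalid(const std::string& m) {
    return MiniStatus{false, m};
  }
};

// Same shape as RETURN_NOT_OK, with no semicolon after while (0).
#define MINI_RETURN_NOT_OK(s) do { \
    MiniStatus _s = (s);           \
    if (!_s.ok()) return _s;       \
  } while (0)

inline MiniStatus CheckPositive(int x) {
  return x > 0 ? MiniStatus::OK() : MiniStatus::Invalid("not positive");
}

// The macro is used as the sole statement of an unbraced if. If the macro
// itself ended in ';', the caller's ';' would add an empty statement and
// the following 'else' would no longer attach to this 'if'.
inline MiniStatus Process(int x, bool strict) {
  if (strict)
    MINI_RETURN_NOT_OK(CheckPositive(x));
  else
    return MiniStatus::OK();
  return MiniStatus::OK();
}
```

With this definition, Process(-1, true) propagates the "not positive" error from CheckPositive, while Process(-1, false) skips the check and returns OK.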
http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/src/arrow/util/test_main.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/test_main.cc b/cpp/src/arrow/util/test_main.cc
new file mode 100644
index 0000000..00139f3
--- /dev/null
+++ b/cpp/src/arrow/util/test_main.cc
@@ -0,0 +1,26 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <gtest/gtest.h>
+
+int main(int argc, char **argv) {
+  ::testing::InitGoogleTest(&argc, argv);
+
+  int ret = RUN_ALL_TESTS();
+
+  return ret;
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/thirdparty/build_thirdparty.sh
----------------------------------------------------------------------
diff --git a/cpp/thirdparty/build_thirdparty.sh b/cpp/thirdparty/build_thirdparty.sh
new file mode 100755
index 0000000..46794de
--- /dev/null
+++ b/cpp/thirdparty/build_thirdparty.sh
@@ -0,0 +1,62 @@
+#!/bin/bash
+
+set -x
+set -e
+TP_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
+
+source $TP_DIR/versions.sh
+PREFIX=$TP_DIR/installed
+
+################################################################################
+
+if [ "$#" = "0" ]; then
+  F_ALL=1
+else
+  # Allow passing specific libs to build on the command line
+  for arg in "$@"; do
+    case $arg in
+      "gtest")      F_GTEST=1 ;;
+      *)            echo "Unknown module: $arg"; exit 1 ;;
+    esac
+  done
+fi
+
+################################################################################
+
+# Determine how many parallel jobs to use for make based on the number of cores
+if [[ "$OSTYPE" =~ ^linux ]]; then
+  PARALLEL=$(grep -c processor /proc/cpuinfo)
+elif [[ "$OSTYPE" == "darwin"* ]]; then
+  PARALLEL=$(sysctl -n hw.ncpu)
+else
+  echo Unsupported platform $OSTYPE
+  exit 1
+fi
+
+mkdir -p "$PREFIX/include"
+mkdir -p "$PREFIX/lib"
+
+# On some systems, autotools installs libraries to lib64 rather than lib.  Fix
+# this by setting up lib64 as a symlink to lib.  We have to do this step first
+# to handle cases where one third-party library depends on another.
+ln -sf lib "$PREFIX/lib64"
+
+# use the compiled tools
+export PATH=$PREFIX/bin:$PATH
+
+
+# build googletest
+if [ -n "$F_ALL" -o -n "$F_GTEST" ]; then
+  cd $TP_DIR/$GTEST_BASEDIR
+
+  if [[ "$OSTYPE" == "darwin"* ]]; then
+    CXXFLAGS=-fPIC cmake -DCMAKE_CXX_FLAGS="-std=c++11 -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -Wno-unused-value -Wno-ignored-attributes" .
+  else
+    CXXFLAGS=-fPIC cmake .
+  fi
+
+  make VERBOSE=1
+fi
+
+echo "---------------------"
+echo "Thirdparty dependencies built and installed into $PREFIX successfully"

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/thirdparty/download_thirdparty.sh
----------------------------------------------------------------------
diff --git a/cpp/thirdparty/download_thirdparty.sh b/cpp/thirdparty/download_thirdparty.sh
new file mode 100755
index 0000000..8ffb22a
--- /dev/null
+++ b/cpp/thirdparty/download_thirdparty.sh
@@ -0,0 +1,20 @@
+#!/bin/bash
+
+set -x
+set -e
+
+TP_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
+
+source $TP_DIR/versions.sh
+
+download_extract_and_cleanup() {
+	filename=$TP_DIR/$(basename "$1")
+	curl -#LC - "$1" -o $filename
+	tar xzf $filename -C $TP_DIR
+	rm $filename
+}
+
+if [ ! -d ${GTEST_BASEDIR} ]; then
+  echo "Fetching gtest"
+  download_extract_and_cleanup $GTEST_URL
+fi

http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/thirdparty/versions.sh
----------------------------------------------------------------------
diff --git a/cpp/thirdparty/versions.sh b/cpp/thirdparty/versions.sh
new file mode 100755
index 0000000..12ad56e
--- /dev/null
+++ b/cpp/thirdparty/versions.sh
@@ -0,0 +1,3 @@
+GTEST_VERSION=1.7.0
+GTEST_URL="https://github.com/google/googletest/archive/release-${GTEST_VERSION}.tar.gz"
+GTEST_BASEDIR=googletest-release-$GTEST_VERSION


[02/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java b/java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
new file mode 100644
index 0000000..576a5b6
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
@@ -0,0 +1,737 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import io.netty.buffer.ArrowBuf;
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.UnpooledByteBufAllocator;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+
+import org.apache.arrow.vector.holders.Decimal38SparseHolder;
+
+public class DecimalUtility extends CoreDecimalUtility {
+
+    public final static int MAX_DIGITS = 9;
+    public final static int DIGITS_BASE = 1000000000;
+    public final static int DIGITS_MAX = 999999999;
+    public final static int INTEGER_SIZE = (Integer.SIZE/8);
+
+    public final static String[] decimalToString = {"",
+            "0",
+            "00",
+            "000",
+            "0000",
+            "00000",
+            "000000",
+            "0000000",
+            "00000000",
+            "000000000"};
+
+    public final static long[] scale_long_constants = {
+        1,
+        10,
+        100,
+        1000,
+        10000,
+        100000,
+        1000000,
+        10000000,
+        100000000,
+        1000000000,
+        10000000000L,
+        100000000000L,
+        1000000000000L,
+        10000000000000L,
+        100000000000000L,
+        1000000000000000L,
+        10000000000000000L,
+        100000000000000000L,
+        1000000000000000000L};
+
+    /*
+     * Simple function that returns the static precomputed
+     * power of ten, instead of using Math.pow
+     */
+    public static long getPowerOfTen(int power) {
+      assert power >= 0 && power < scale_long_constants.length;
+      return scale_long_constants[(power)];
+    }
+
+    /*
+     * Math.pow returns a double and while multiplying with large digits
+     * in the decimal data type we encounter noise. So instead of multiplying
+     * with Math.pow we use the static constants to perform the multiplication
+     */
+    public static long adjustScaleMultiply(long input, int factor) {
+      int index = Math.abs(factor);
+      assert index >= 0 && index < scale_long_constants.length;
+      if (factor >= 0) {
+        return input * scale_long_constants[index];
+      } else {
+        return input / scale_long_constants[index];
+      }
+    }
+
+    public static long adjustScaleDivide(long input, int factor) {
+      int index = Math.abs(factor);
+      assert index >= 0 && index < scale_long_constants.length;
+      if (factor >= 0) {
+        return input / scale_long_constants[index];
+      } else {
+        return input * scale_long_constants[index];
+      }
+    }
+
+    /* Given the number of actual digits this function returns the
+     * number of indexes it will occupy in the array of integers
+     * which are stored in base 1 billion
+     */
+    public static int roundUp(int ndigits) {
+        return (ndigits + MAX_DIGITS - 1)/MAX_DIGITS;
+    }
+
+    /* Returns a string representation of the given integer.
+     * If the length of the given integer is less than the
+     * passed length, this function will prepend zeroes to the string.
+     */
+    public static StringBuilder toStringWithZeroes(int number, int desiredLength) {
+        String value = ((Integer) number).toString();
+        int length = value.length();
+
+        StringBuilder str = new StringBuilder();
+        str.append(decimalToString[desiredLength - length]);
+        str.append(value);
+
+        return str;
+    }
+
+    public static StringBuilder toStringWithZeroes(long number, int desiredLength) {
+        String value = ((Long) number).toString();
+        int length = value.length();
+
+        StringBuilder str = new StringBuilder();
+
+        // Desired length can be > MAX_DIGITS
+        int zeroesLength = desiredLength - length;
+        while (zeroesLength > MAX_DIGITS) {
+            str.append(decimalToString[MAX_DIGITS]);
+            zeroesLength -= MAX_DIGITS;
+        }
+        str.append(decimalToString[zeroesLength]);
+        str.append(value);
+
+        return str;
+    }
+
+  public static BigDecimal getBigDecimalFromIntermediate(ByteBuf data, int startIndex, int nDecimalDigits, int scale) {
+
+        // In the intermediate representation we don't pad the scale with zeroes, so set truncate = false
+        return getBigDecimalFromDrillBuf(data, startIndex, nDecimalDigits, scale, false);
+    }
+
+    public static BigDecimal getBigDecimalFromSparse(ArrowBuf data, int startIndex, int nDecimalDigits, int scale) {
+
+        // In the sparse representation we pad the scale with zeroes for ease of arithmetic, need to truncate
+        return getBigDecimalFromDrillBuf(data, startIndex, nDecimalDigits, scale, true);
+    }
+
+    public static BigDecimal getBigDecimalFromDrillBuf(ArrowBuf bytebuf, int start, int length, int scale) {
+      byte[] value = new byte[length];
+      bytebuf.getBytes(start, value, 0, length);
+      BigInteger unscaledValue = new BigInteger(value);
+      return new BigDecimal(unscaledValue, scale);
+    }
+
+  public static BigDecimal getBigDecimalFromByteBuffer(ByteBuffer bytebuf, int start, int length, int scale) {
+    byte[] value = new byte[length];
+    bytebuf.get(value);
+    BigInteger unscaledValue = new BigInteger(value);
+    return new BigDecimal(unscaledValue, scale);
+  }
+
+    /* Create a BigDecimal object using the data in the DrillBuf.
+     * This function assumes that the data is provided in a non-dense format.
+     * It works on both sparse and intermediate representations.
+     */
+  public static BigDecimal getBigDecimalFromDrillBuf(ByteBuf data, int startIndex, int nDecimalDigits, int scale,
+      boolean truncateScale) {
+
+        // For sparse decimal type we have padded zeroes at the end, strip them while converting to BigDecimal.
+        int actualDigits;
+
+        // Initialize the BigDecimal, first digit in the DrillBuf has the sign so mask it out
+        BigInteger decimalDigits = BigInteger.valueOf((data.getInt(startIndex)) & 0x7FFFFFFF);
+
+        BigInteger base = BigInteger.valueOf(DIGITS_BASE);
+
+        for (int i = 1; i < nDecimalDigits; i++) {
+
+            BigInteger temp = BigInteger.valueOf(data.getInt(startIndex + (i * INTEGER_SIZE)));
+            decimalDigits = decimalDigits.multiply(base);
+            decimalDigits = decimalDigits.add(temp);
+        }
+
+        // Truncate any additional padding we might have added
+        if (truncateScale == true && scale > 0 && (actualDigits = scale % MAX_DIGITS) != 0) {
+            BigInteger truncate = BigInteger.valueOf((int)Math.pow(10, (MAX_DIGITS - actualDigits)));
+            decimalDigits = decimalDigits.divide(truncate);
+        }
+
+        // set the sign
+        if ((data.getInt(startIndex) & 0x80000000) != 0) {
+            decimalDigits = decimalDigits.negate();
+        }
+
+        BigDecimal decimal = new BigDecimal(decimalDigits, scale);
+
+        return decimal;
+    }
+
+    /* This function returns a BigDecimal object from the dense decimal representation.
+     * First step is to convert the dense representation into an intermediate representation
+     * and then invoke getBigDecimalFromDrillBuf() to get the BigDecimal object
+     */
+    public static BigDecimal getBigDecimalFromDense(ArrowBuf data, int startIndex, int nDecimalDigits, int scale, int maxPrecision, int width) {
+
+        /* This method converts the dense representation to
+         * an intermediate representation. The intermediate
+         * representation has one more integer than the dense
+         * representation.
+         */
+        byte[] intermediateBytes = new byte[((nDecimalDigits + 1) * INTEGER_SIZE)];
+
+        // Start storing from the least significant byte of the first integer
+        int intermediateIndex = 3;
+
+        int[] mask = {0x03, 0x0F, 0x3F, 0xFF};
+        int[] reverseMask = {0xFC, 0xF0, 0xC0, 0x00};
+
+        int maskIndex;
+        int shiftOrder;
+        byte shiftBits;
+
+        // TODO: Some of the logic here is common with casting from Dense to Sparse types, factor out common code
+        if (maxPrecision == 38) {
+            maskIndex = 0;
+            shiftOrder = 6;
+            shiftBits = 0x00;
+            intermediateBytes[intermediateIndex++] = (byte) (data.getByte(startIndex) & 0x7F);
+        } else if (maxPrecision == 28) {
+            maskIndex = 1;
+            shiftOrder = 4;
+            shiftBits = (byte) ((data.getByte(startIndex) & 0x03) << shiftOrder);
+            intermediateBytes[intermediateIndex++] = (byte) (((data.getByte(startIndex) & 0x3C) & 0xFF) >>> 2);
+        } else {
+            throw new UnsupportedOperationException("Only dense types with max precision 28 or 38 are supported");
+        }
+
+        int inputIndex = 1;
+        boolean sign = false;
+
+        if ((data.getByte(startIndex) & 0x80) != 0) {
+            sign = true;
+        }
+
+        while (inputIndex < width) {
+
+            intermediateBytes[intermediateIndex] = (byte) ((shiftBits) | (((data.getByte(startIndex + inputIndex) & reverseMask[maskIndex]) & 0xFF) >>> (8 - shiftOrder)));
+
+            shiftBits = (byte) ((data.getByte(startIndex + inputIndex) & mask[maskIndex]) << shiftOrder);
+
+            inputIndex++;
+            intermediateIndex++;
+
+            if (((inputIndex - 1) % INTEGER_SIZE) == 0) {
+                shiftBits = (byte) ((shiftBits & 0xFF) >>> 2);
+                maskIndex++;
+                shiftOrder -= 2;
+            }
+
+        }
+        /* copy the last byte */
+        intermediateBytes[intermediateIndex] = shiftBits;
+
+        if (sign == true) {
+            intermediateBytes[0] = (byte) (intermediateBytes[0] | 0x80);
+        }
+
+    final ByteBuf intermediate = UnpooledByteBufAllocator.DEFAULT.buffer(intermediateBytes.length);
+    try {
+        intermediate.setBytes(0, intermediateBytes);
+
+      BigDecimal ret = getBigDecimalFromIntermediate(intermediate, 0, nDecimalDigits + 1, scale);
+      return ret;
+    } finally {
+      intermediate.release();
+    }
+
+    }
+
+    /*
+     * Converts the BigDecimal and stores it in our internal sparse representation
+     */
+  public static void getSparseFromBigDecimal(BigDecimal input, ByteBuf data, int startIndex, int scale, int precision,
+      int nDecimalDigits) {
+
+        // Initialize the buffer
+        for (int i = 0; i < nDecimalDigits; i++) {
+          data.setInt(startIndex + (i * INTEGER_SIZE), 0);
+        }
+
+        boolean sign = false;
+
+        if (input.signum() == -1) {
+            // negative input
+            sign = true;
+            input = input.abs();
+        }
+
+        // Truncate the input as per the scale provided
+        input = input.setScale(scale, BigDecimal.ROUND_HALF_UP);
+
+        // Separate out the integer part
+        BigDecimal integerPart = input.setScale(0, BigDecimal.ROUND_DOWN);
+
+        int destIndex = nDecimalDigits - roundUp(scale) - 1;
+
+        // we use base 1 billion integer digits for our internal representation
+        BigDecimal base = new BigDecimal(DIGITS_BASE);
+
+        while (integerPart.compareTo(BigDecimal.ZERO) > 0) {
+            // store the modulo as the integer value
+            data.setInt(startIndex + (destIndex * INTEGER_SIZE), (integerPart.remainder(base)).intValue());
+            destIndex--;
+            // Divide by base 1 billion
+            integerPart = (integerPart.divide(base)).setScale(0, BigDecimal.ROUND_DOWN);
+        }
+
+        /* Sparse representation contains padding of additional zeroes
+         * so each digit contains MAX_DIGITS for ease of arithmetic
+         */
+        int actualDigits;
+        if ((actualDigits = (scale % MAX_DIGITS)) != 0) {
+            // Pad additional zeroes
+            scale = scale + (MAX_DIGITS - actualDigits);
+            input = input.setScale(scale, BigDecimal.ROUND_DOWN);
+        }
+
+        //separate out the fractional part
+        BigDecimal fractionalPart = input.remainder(BigDecimal.ONE).movePointRight(scale);
+
+        destIndex = nDecimalDigits - 1;
+
+        while (scale > 0) {
+            // Get the next set of MAX_DIGITS (9) digits and store it in the DrillBuf
+            fractionalPart = fractionalPart.movePointLeft(MAX_DIGITS);
+            BigDecimal temp = fractionalPart.remainder(BigDecimal.ONE);
+
+            data.setInt(startIndex + (destIndex * INTEGER_SIZE), (temp.unscaledValue().intValue()));
+            destIndex--;
+
+            fractionalPart = fractionalPart.setScale(0, BigDecimal.ROUND_DOWN);
+            scale -= MAX_DIGITS;
+        }
+
+        // Set the negative sign
+        if (sign == true) {
+            data.setInt(startIndex, data.getInt(startIndex) | 0x80000000);
+        }
+
+    }
+
+
+    public static long getDecimal18FromBigDecimal(BigDecimal input, int scale, int precision) {
+        // Truncate or pad to set the input to the correct scale
+        input = input.setScale(scale, BigDecimal.ROUND_HALF_UP);
+
+        return (input.unscaledValue().longValue());
+    }
+
+    public static BigDecimal getBigDecimalFromPrimitiveTypes(int input, int scale, int precision) {
+      return BigDecimal.valueOf(input, scale);
+    }
+
+    public static BigDecimal getBigDecimalFromPrimitiveTypes(long input, int scale, int precision) {
+      return BigDecimal.valueOf(input, scale);
+    }
+
+
+    public static int compareDenseBytes(ArrowBuf left, int leftStart, boolean leftSign, ArrowBuf right, int rightStart, boolean rightSign, int width) {
+
+      int invert = 1;
+
+      /* If signs are different then simply look at the
+       * sign of the two inputs and determine which is greater
+       */
+      if (leftSign != rightSign) {
+
+        return((leftSign == true) ? -1 : 1);
+      } else if(leftSign == true) {
+        /* Both inputs are negative, at the end we will
+         * have to invert the comparison
+         */
+        invert = -1;
+      }
+
+      int cmp = 0;
+
+      for (int i = 0; i < width; i++) {
+        byte leftByte  = left.getByte(leftStart + i);
+        byte rightByte = right.getByte(rightStart + i);
+        // Unsigned byte comparison
+        if ((leftByte & 0xFF) > (rightByte & 0xFF)) {
+          cmp = 1;
+          break;
+        } else if ((leftByte & 0xFF) < (rightByte & 0xFF)) {
+          cmp = -1;
+          break;
+        }
+      }
+      cmp *= invert; // invert the comparison if both were negative values
+
+      return cmp;
+    }
+
+    public static int getIntegerFromSparseBuffer(ArrowBuf buffer, int start, int index) {
+      int value = buffer.getInt(start + (index * 4));
+
+      if (index == 0) {
+        /* the first integer contains the sign bit, return the value without it */
+        value = (value & 0x7FFFFFFF);
+      }
+      return value;
+    }
+
+    public static void setInteger(ArrowBuf buffer, int start, int index, int value) {
+      buffer.setInt(start + (index * 4), value);
+    }
+
+    public static int compareSparseBytes(ArrowBuf left, int leftStart, boolean leftSign, int leftScale, int leftPrecision, ArrowBuf right, int rightStart, boolean rightSign, int rightPrecision, int rightScale, int width, int nDecimalDigits, boolean absCompare) {
+
+      int invert = 1;
+
+      if (absCompare == false) {
+        if (leftSign != rightSign) {
+          return (leftSign == true) ? -1 : 1;
+        }
+
+        // Both values are negative, so invert the outcome of the comparison
+        if (leftSign == true) {
+          invert = -1;
+        }
+      }
+
+      int cmp = compareSparseBytesInner(left, leftStart, leftSign, leftScale, leftPrecision, right, rightStart, rightSign, rightPrecision, rightScale, width, nDecimalDigits);
+      return cmp * invert;
+    }
+    public static int compareSparseBytesInner(ArrowBuf left, int leftStart, boolean leftSign, int leftScale, int leftPrecision, ArrowBuf right, int rightStart, boolean rightSign, int rightPrecision, int rightScale, int width, int nDecimalDigits) {
+      /* compute the number of integer digits in each decimal */
+      int leftInt  = leftPrecision - leftScale;
+      int rightInt = rightPrecision - rightScale;
+
+      /* compute the number of indexes required for storing integer digits */
+      int leftIntRoundedUp = org.apache.arrow.vector.util.DecimalUtility.roundUp(leftInt);
+      int rightIntRoundedUp = org.apache.arrow.vector.util.DecimalUtility.roundUp(rightInt);
+
+      /* compute number of indexes required for storing scale */
+      int leftScaleRoundedUp = org.apache.arrow.vector.util.DecimalUtility.roundUp(leftScale);
+      int rightScaleRoundedUp = org.apache.arrow.vector.util.DecimalUtility.roundUp(rightScale);
+
+      /* compute index of the most significant integer digits */
+      int leftIndex1 = nDecimalDigits - leftScaleRoundedUp - leftIntRoundedUp;
+      int rightIndex1 = nDecimalDigits - rightScaleRoundedUp - rightIntRoundedUp;
+
+      int leftStopIndex = nDecimalDigits - leftScaleRoundedUp;
+      int rightStopIndex = nDecimalDigits - rightScaleRoundedUp;
+
+      /* Discard the zeroes in the integer part */
+      while (leftIndex1 < leftStopIndex) {
+        if (getIntegerFromSparseBuffer(left, leftStart, leftIndex1) != 0) {
+          break;
+        }
+
+        /* Digit in this location is zero, decrement the actual number
+         * of integer digits
+         */
+        leftIntRoundedUp--;
+        leftIndex1++;
+      }
+
+      /* If we reached the stop index then the number of integers is zero */
+      if (leftIndex1 == leftStopIndex) {
+        leftIntRoundedUp = 0;
+      }
+
+      while (rightIndex1 < rightStopIndex) {
+        if (getIntegerFromSparseBuffer(right, rightStart, rightIndex1) != 0) {
+          break;
+        }
+
+        /* Digit in this location is zero, decrement the actual number
+         * of integer digits
+         */
+        rightIntRoundedUp--;
+        rightIndex1++;
+      }
+
+      if (rightIndex1 == rightStopIndex) {
+        rightIntRoundedUp = 0;
+      }
+
+      /* We have the accurate number of non-zero integer digits,
+       * if the number of integer digits are different then we can determine
+       * which decimal is larger and needn't go down to comparing individual values
+       */
+      if (leftIntRoundedUp > rightIntRoundedUp) {
+        return 1;
+      }
+      else if (rightIntRoundedUp > leftIntRoundedUp) {
+        return -1;
+      }
+
+      /* The number of integer digits is the same, so set each index
+       * to the first non-zero integer and compare each digit
+       */
+      leftIndex1 = nDecimalDigits - leftScaleRoundedUp - leftIntRoundedUp;
+      rightIndex1 = nDecimalDigits - rightScaleRoundedUp - rightIntRoundedUp;
+
+      while (leftIndex1 < leftStopIndex && rightIndex1 < rightStopIndex) {
+        if (getIntegerFromSparseBuffer(left, leftStart, leftIndex1) > getIntegerFromSparseBuffer(right, rightStart, rightIndex1)) {
+          return 1;
+        }
+        else if (getIntegerFromSparseBuffer(right, rightStart, rightIndex1) > getIntegerFromSparseBuffer(left, leftStart, leftIndex1)) {
+          return -1;
+        }
+
+        leftIndex1++;
+        rightIndex1++;
+      }
+
+      /* The integer parts of both decimals are equal, now compare
+       * each individual fractional part. Set the index to be at the
+       * beginning of the fractional part
+       */
+      leftIndex1 = leftStopIndex;
+      rightIndex1 = rightStopIndex;
+
+      /* Stop indexes will be the end of the array */
+      leftStopIndex = nDecimalDigits;
+      rightStopIndex = nDecimalDigits;
+
+      /* compare the two fractional parts of the decimal */
+      while (leftIndex1 < leftStopIndex && rightIndex1 < rightStopIndex) {
+        if (getIntegerFromSparseBuffer(left, leftStart, leftIndex1) > getIntegerFromSparseBuffer(right, rightStart, rightIndex1)) {
+          return 1;
+        }
+        else if (getIntegerFromSparseBuffer(right, rightStart, rightIndex1) > getIntegerFromSparseBuffer(left, leftStart, leftIndex1)) {
+          return -1;
+        }
+
+        leftIndex1++;
+        rightIndex1++;
+      }
+
+      /* So far the fractional parts of the decimals are equal, check
+       * if one of the decimals has a remaining fractional part
+       * that is non-zero
+       */
+      while (leftIndex1 < leftStopIndex) {
+        if (getIntegerFromSparseBuffer(left, leftStart, leftIndex1) != 0) {
+          return 1;
+        }
+        leftIndex1++;
+      }
+
+      while(rightIndex1 < rightStopIndex) {
+        if (getIntegerFromSparseBuffer(right, rightStart, rightIndex1) != 0) {
+          return -1;
+        }
+        rightIndex1++;
+      }
+
+      /* Both decimal values are equal */
+      return 0;
+    }
+
+    public static BigDecimal getBigDecimalFromByteArray(byte[] bytes, int start, int length, int scale) {
+      byte[] value = Arrays.copyOfRange(bytes, start, start + length);
+      BigInteger unscaledValue = new BigInteger(value);
+      return new BigDecimal(unscaledValue, scale);
+    }
+
+  public static void roundDecimal(ArrowBuf result, int start, int nDecimalDigits, int desiredScale, int currentScale) {
+    int newScaleRoundedUp  = org.apache.arrow.vector.util.DecimalUtility.roundUp(desiredScale);
+    int origScaleRoundedUp = org.apache.arrow.vector.util.DecimalUtility.roundUp(currentScale);
+
+    if (desiredScale < currentScale) {
+
+      boolean roundUp = false;
+
+      //Extract the first digit to be truncated to check if we need to round up
+      int truncatedScaleIndex = desiredScale + 1;
+      if (truncatedScaleIndex <= currentScale) {
+        int extractDigitIndex = nDecimalDigits - origScaleRoundedUp -1;
+        extractDigitIndex += org.apache.arrow.vector.util.DecimalUtility.roundUp(truncatedScaleIndex);
+        int extractDigit = getIntegerFromSparseBuffer(result, start, extractDigitIndex);
+        int temp = org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS - (truncatedScaleIndex % org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS);
+        if (temp != 0) {
+          extractDigit = extractDigit / (int) (Math.pow(10, temp));
+        }
+        if ((extractDigit % 10)  > 4) {
+          roundUp = true;
+        }
+      }
+
+      // Get the source index beyond which we will truncate
+      int srcIntIndex = nDecimalDigits - origScaleRoundedUp - 1;
+      int srcIndex = srcIntIndex + newScaleRoundedUp;
+
+      // Truncate the remaining fractional part, move the integer part
+      int destIndex = nDecimalDigits - 1;
+      if (srcIndex != destIndex) {
+        while (srcIndex >= 0) {
+          setInteger(result, start, destIndex--, getIntegerFromSparseBuffer(result, start, srcIndex--));
+        }
+
+        // Set the remaining portion of the decimal to be zeroes
+        while (destIndex >= 0) {
+          setInteger(result, start, destIndex--, 0);
+        }
+        srcIndex = nDecimalDigits - 1;
+      }
+
+      // We truncated the decimal digit. Now we need to truncate within the base 1 billion fractional digit
+      int truncateFactor = org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS - (desiredScale % org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS);
+      if (truncateFactor != org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS) {
+        truncateFactor = (int) Math.pow(10, truncateFactor);
+        int fractionalDigits = getIntegerFromSparseBuffer(result, start, nDecimalDigits - 1);
+        fractionalDigits /= truncateFactor;
+        setInteger(result, start, nDecimalDigits - 1, fractionalDigits * truncateFactor);
+      }
+
+      // Finally round up the digit if needed
+      if (roundUp == true) {
+        srcIndex = nDecimalDigits - 1;
+        int carry;
+        if (truncateFactor != org.apache.arrow.vector.util.DecimalUtility.MAX_DIGITS) {
+          carry = truncateFactor;
+        } else {
+          carry = 1;
+        }
+
+        while (srcIndex >= 0) {
+          int value = getIntegerFromSparseBuffer(result, start, srcIndex);
+          value += carry;
+
+          if (value >= org.apache.arrow.vector.util.DecimalUtility.DIGITS_BASE) {
+            setInteger(result, start, srcIndex--, value % org.apache.arrow.vector.util.DecimalUtility.DIGITS_BASE);
+            carry = value / org.apache.arrow.vector.util.DecimalUtility.DIGITS_BASE;
+          } else {
+            setInteger(result, start, srcIndex--, value);
+            carry = 0;
+            break;
+          }
+        }
+      }
+    } else if (desiredScale > currentScale) {
+      // Add fractional digits to the decimal
+
+      // Check if we need to shift the decimal digits to the left
+      if (newScaleRoundedUp > origScaleRoundedUp) {
+        int srcIndex  = 0;
+        int destIndex = newScaleRoundedUp - origScaleRoundedUp;
+
+        // Check while extending scale, we are not overwriting integer part
+        while (srcIndex < destIndex) {
+          if (getIntegerFromSparseBuffer(result, start, srcIndex++) != 0) {
+            throw new RuntimeException("Extending the scale would lose part of the integer portion; reduce the specified scale");
+          }
+        }
+
+        srcIndex = 0;
+        while (destIndex < nDecimalDigits) {
+          setInteger(result, start, srcIndex++, getIntegerFromSparseBuffer(result, start, destIndex++));
+        }
+
+        // Clear the remaining part
+        while (srcIndex < nDecimalDigits) {
+          setInteger(result, start, srcIndex++, 0);
+        }
+      }
+    }
+  }
+
+  public static int getFirstFractionalDigit(int decimal, int scale) {
+    if (scale == 0) {
+      return 0;
+    }
+    int temp = (int) adjustScaleDivide(decimal, scale - 1);
+    return Math.abs(temp % 10);
+  }
+
+  public static int getFirstFractionalDigit(long decimal, int scale) {
+    if (scale == 0) {
+      return 0;
+    }
+    long temp = adjustScaleDivide(decimal, scale - 1);
+    return (int) (Math.abs(temp % 10));
+  }
+
+  public static int getFirstFractionalDigit(ArrowBuf data, int scale, int start, int nDecimalDigits) {
+    if (scale == 0) {
+      return 0;
+    }
+
+    int index = nDecimalDigits - roundUp(scale);
+    return (int) (adjustScaleDivide(data.getInt(start + (index * INTEGER_SIZE)), MAX_DIGITS - 1));
+  }
+
+  public static int compareSparseSamePrecScale(ArrowBuf left, int lStart, byte[] right, int length) {
+    // check the sign first
+    boolean lSign = (left.getInt(lStart) & 0x80000000) != 0;
+    boolean rSign = ByteFunctionHelpers.getSign(right);
+    int cmp = 0;
+
+    if (lSign != rSign) {
+      return (lSign == false) ? 1 : -1;
+    }
+
+    // invert the comparison if we are comparing negative numbers
+    int invert = (lSign == true) ? -1 : 1;
+
+    // compare integer by integer
+    int n = 0;
+    int lPos = lStart;
+    int rPos = 0;
+    while (n < length/4) {
+      int leftInt = Decimal38SparseHolder.getInteger(n, lStart, left);
+      int rightInt = ByteFunctionHelpers.getInteger(right, n);
+      if (leftInt != rightInt) {
+        cmp =  (leftInt - rightInt ) > 0 ? 1 : -1;
+        break;
+      }
+      n++;
+    }
+    return cmp * invert;
+  }
+}
+
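Note on the power-of-ten table in DecimalUtility: the comment on adjustScaleMultiply says that Math.pow returns a double and introduces noise when multiplied with large values. The following is a minimal standalone sketch of that effect, not part of the patch. The class and method names (PowerOfTenSketch, scaleWithTable, scaleWithPow) are hypothetical and chosen only for illustration, and the sketch does no overflow checking.

```java
// Illustrative sketch only; names here are assumptions, not Arrow APIs.
class PowerOfTenSketch {

    // Exact powers of ten that fit in a long: 10^0 through 10^18.
    public static final long[] POWERS_OF_TEN = new long[19];

    static {
        POWERS_OF_TEN[0] = 1L;
        for (int i = 1; i < POWERS_OF_TEN.length; i++) {
            POWERS_OF_TEN[i] = POWERS_OF_TEN[i - 1] * 10L;
        }
    }

    // Scales with pure integer arithmetic, so no precision is lost
    // (overflow is not checked in this sketch).
    public static long scaleWithTable(long input, int power) {
        return input * POWERS_OF_TEN[power];
    }

    // Scales through a double. The long input is converted to double first,
    // which rounds any value that needs more than 53 significant bits.
    public static long scaleWithPow(long input, int power) {
        return (long) (input * Math.pow(10, power));
    }

    public static void main(String[] args) {
        // An odd value above 2^53 cannot be represented exactly as a double.
        long input = 12345678901234567L;
        System.out.println("table: " + scaleWithTable(input, 2));
        System.out.println("pow:   " + scaleWithPow(input, 2));
    }
}
```

The table-based result keeps every digit of the input, while the double-based result is rounded to the nearest representable double before being cast back to long.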

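Note on the sparse layout used by DecimalUtility: the code above stores a decimal as an array of 4-byte integers, each holding nine decimal digits (base 1 billion), with the sign kept in the top bit of the first integer. The sketch below illustrates that idea for a plain integer value using int[] instead of a buffer. It is an assumption-laden simplification: the names SparseDigitsSketch, toWords and fromWords are hypothetical, and the zero-padding of the fractional part that the real sparse format applies is left out.

```java
import java.math.BigInteger;

// Illustrative sketch only; names here are assumptions, not Arrow APIs.
class SparseDigitsSketch {

    public static final int MAX_DIGITS = 9;
    public static final int DIGITS_BASE = 1000000000;

    // Number of base-1-billion words needed to hold ndigits decimal digits.
    public static int roundUp(int ndigits) {
        return (ndigits + MAX_DIGITS - 1) / MAX_DIGITS;
    }

    // Splits value into nWords base-1-billion words, most significant first.
    // Each word is below 10^9 < 2^30, so the top bit of word 0 is free for the sign.
    public static int[] toWords(BigInteger value, int nWords) {
        int[] words = new int[nWords];
        BigInteger base = BigInteger.valueOf(DIGITS_BASE);
        BigInteger rest = value.abs();
        for (int i = nWords - 1; i >= 0; i--) {
            BigInteger[] quotientAndRemainder = rest.divideAndRemainder(base);
            words[i] = quotientAndRemainder[1].intValue();
            rest = quotientAndRemainder[0];
        }
        if (rest.signum() != 0) {
            throw new ArithmeticException("value does not fit in " + nWords + " words");
        }
        if (value.signum() < 0) {
            words[0] |= 0x80000000;
        }
        return words;
    }

    // Rebuilds the value: mask the sign out of word 0, then accumulate in base 1 billion.
    public static BigInteger fromWords(int[] words) {
        BigInteger base = BigInteger.valueOf(DIGITS_BASE);
        BigInteger result = BigInteger.valueOf(words[0] & 0x7FFFFFFF);
        for (int i = 1; i < words.length; i++) {
            result = result.multiply(base).add(BigInteger.valueOf(words[i]));
        }
        return (words[0] & 0x80000000) != 0 ? result.negate() : result;
    }

    public static void main(String[] args) {
        BigInteger value = new BigInteger("-1234567890123");
        int[] words = toWords(value, roundUp(13) + 1);
        System.out.println(fromWords(words).equals(value));
    }
}
```

The decode loop has the same shape as the multiply-and-add loop in getBigDecimalFromDrillBuf, which also masks the sign bit out of the first integer before accumulating.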
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringArrayList.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringArrayList.java b/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringArrayList.java
new file mode 100644
index 0000000..7aeaa12
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringArrayList.java
@@ -0,0 +1,57 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+
+public class JsonStringArrayList<E> extends ArrayList<E> {
+
+  private static ObjectMapper mapper;
+
+  static {
+    mapper = new ObjectMapper();
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (this == obj) {
+      return true;
+    }
+    if (obj == null) {
+      return false;
+    }
+    if (!(obj instanceof List)) {
+      return false;
+    }
+    List other = (List) obj;
+    return this.size() == other.size() && this.containsAll(other);
+  }
+
+  @Override
+  public final String toString() {
+    try {
+      return mapper.writeValueAsString(this);
+    } catch(JsonProcessingException e) {
+      throw new IllegalStateException("Cannot serialize array list to JSON string", e);
+    }
+  }
+}
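A note on the equals() override above: because it only checks size() and containsAll(), it ignores element order and, for lists with duplicates, element counts. The sketch below is a standalone illustration of that behavior. The class name LooseEqualsList is hypothetical, and the Jackson-based toString() is left out so the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JsonStringArrayList: same equals() logic, no Jackson dependency.
class LooseEqualsList<E> extends ArrayList<E> {

  LooseEqualsList(List<E> values) {
    super(values);
  }

  // hashCode() is left as inherited from ArrayList, as in the patch.
  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof List)) {
      return false;
    }
    List<?> other = (List<?>) obj;
    // Same check as in the patch: equal sizes, and every element of 'other' occurs in 'this'.
    return this.size() == other.size() && this.containsAll(other);
  }
}
```

With this definition, a list holding 1, 2, 3 compares equal to one holding 3, 2, 1, and a list holding 1, 1, 2 compares equal to one holding 1, 2, 2. The comparison is also not symmetric: a plain java.util list on the left-hand side still uses the order-sensitive List.equals().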

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringHashMap.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringHashMap.java b/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringHashMap.java
new file mode 100644
index 0000000..750dd59
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/JsonStringHashMap.java
@@ -0,0 +1,76 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+
+/*
+ * Simple class that extends java.util.LinkedHashMap (so insertion order is preserved) but overrides
+ * the toString() method to produce a JSON string instead.
+ */
+public class JsonStringHashMap<K, V> extends LinkedHashMap<K, V> {
+
+  private static ObjectMapper mapper;
+
+  static {
+    mapper = new ObjectMapper();
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (this == obj) {
+      return true;
+    }
+    if (obj == null) {
+      return false;
+    }
+    if (!(obj instanceof Map)) {
+      return false;
+    }
+    Map other = (Map) obj;
+    if (this.size() != other.size()) {
+      return false;
+    }
+    for (K key : this.keySet()) {
+      if (this.get(key) == null ) {
+        if (other.get(key) == null) {
+          continue;
+        } else {
+          return false;
+        }
+      }
+      if ( ! this.get(key).equals(other.get(key))) {
+        return false;
+      }
+    }
+    return true;
+  }
+
+  @Override
+  public final String toString() {
+    try {
+      return mapper.writeValueAsString(this);
+    } catch(JsonProcessingException e) {
+      throw new IllegalStateException("Cannot serialize hash map to JSON string", e);
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/MapWithOrdinal.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/MapWithOrdinal.java b/java/vector/src/main/java/org/apache/arrow/vector/util/MapWithOrdinal.java
new file mode 100644
index 0000000..dea433e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/MapWithOrdinal.java
@@ -0,0 +1,248 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import java.util.AbstractMap;
+import java.util.Collection;
+import java.util.Map;
+import java.util.Set;
+
+import com.google.common.base.Function;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import io.netty.util.collection.IntObjectHashMap;
+import io.netty.util.collection.IntObjectMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * An implementation of map that supports constant time look-up by a generic key or an ordinal.
+ *
+ * This class extends the functionality of a regular {@link Map} with ordinal lookup support.
+ * Upon insertion an unused ordinal is assigned to the inserted (key, value) tuple.
+ * Upon update the same ordinal id is re-used while value is replaced.
+ * Upon deletion of an existing item, its corresponding ordinal is recycled and could be used by another item.
+ *
+ * For any instance with N items, this implementation guarantees that ordinals are in the range of [0, N). However,
+ * the ordinal assignment is dynamic and may change after an insertion or deletion. Consumers of this class are
+ * responsible for explicitly checking the ordinal corresponding to a key via
+ * {@link org.apache.arrow.vector.util.MapWithOrdinal#getOrdinal(Object)} before attempting to execute a lookup
+ * with an ordinal.
+ *
+ * @param <K> key type
+ * @param <V> value type
+ */
+
+public class MapWithOrdinal<K, V> implements Map<K, V> {
+  private final static Logger logger = LoggerFactory.getLogger(MapWithOrdinal.class);
+
+  private final Map<K, Entry<Integer, V>> primary = Maps.newLinkedHashMap();
+  private final IntObjectHashMap<V> secondary = new IntObjectHashMap<>();
+
+  private final Map<K, V> delegate = new Map<K, V>() {
+    @Override
+    public boolean isEmpty() {
+      return size() == 0;
+    }
+
+    @Override
+    public int size() {
+      return primary.size();
+    }
+
+    @Override
+    public boolean containsKey(Object key) {
+      return primary.containsKey(key);
+    }
+
+    @Override
+    public boolean containsValue(Object value) {
+      return primary.containsValue(value);
+    }
+
+    @Override
+    public V get(Object key) {
+      Entry<Integer, V> pair = primary.get(key);
+      if (pair != null) {
+        return pair.getValue();
+      }
+      return null;
+    }
+
+    @Override
+    public V put(K key, V value) {
+      final Entry<Integer, V> oldPair = primary.get(key);
+      // if the key exists, re-use its ordinal; otherwise assign a new ordinal identifier
+      final int ordinal = oldPair == null ? primary.size():oldPair.getKey();
+      primary.put(key, new AbstractMap.SimpleImmutableEntry<>(ordinal, value));
+      secondary.put(ordinal, value);
+      return oldPair==null ? null:oldPair.getValue();
+    }
+
+    @Override
+    public V remove(Object key) {
+      final Entry<Integer, V> oldPair = primary.remove(key);
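+      if (oldPair!=null) {
+        // primary has already shrunk by one, so its size is the highest ordinal that was in use
+        final int lastOrdinal = primary.size();
+        final V last = secondary.remove(lastOrdinal);
+        // normalize mappings so that all ordinals in [0, primary.size()) stay assigned:
+        // move the entry that held the last ordinal into the slot of the deleted one
+        if (oldPair.getKey() != lastOrdinal) {
+          secondary.put(oldPair.getKey(), last);
+          for (Entry<K, Entry<Integer, V>> entry : primary.entrySet()) {
+            if (entry.getValue().getKey() == lastOrdinal) {
+              entry.setValue(new AbstractMap.SimpleImmutableEntry<>(oldPair.getKey(), last));
+              break;
+            }
+          }
+        }
+      }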
+      return oldPair==null ? null:oldPair.getValue();
+    }
+
+    @Override
+    public void putAll(Map<? extends K, ? extends V> m) {
+      throw new UnsupportedOperationException();
+    }
+
+    @Override
+    public void clear() {
+      primary.clear();
+      secondary.clear();
+    }
+
+    @Override
+    public Set<K> keySet() {
+      return primary.keySet();
+    }
+
+    @Override
+    public Collection<V> values() {
+      return Lists.newArrayList(Iterables.transform(secondary.entries(), new Function<IntObjectMap.Entry<V>, V>() {
+        @Override
+        public V apply(IntObjectMap.Entry<V> entry) {
+          return Preconditions.checkNotNull(entry).value();
+        }
+      }));
+    }
+
+    @Override
+    public Set<Entry<K, V>> entrySet() {
+      return Sets.newHashSet(Iterables.transform(primary.entrySet(), new Function<Entry<K, Entry<Integer, V>>, Entry<K, V>>() {
+        @Override
+        public Entry<K, V> apply(Entry<K, Entry<Integer, V>> entry) {
+          return new AbstractMap.SimpleImmutableEntry<>(entry.getKey(), entry.getValue().getValue());
+        }
+      }));
+    }
+  };
+
+  /**
+   * Returns the value corresponding to the given ordinal
+   *
+   * @param id ordinal value for lookup
+   * @return an instance of V
+   */
+  public V getByOrdinal(int id) {
+    return secondary.get(id);
+  }
+
+  /**
+   * Returns the ordinal corresponding to the given key.
+   *
+   * @param key key for ordinal lookup
+   * @return ordinal value corresponding to key if it exists or -1
+   */
+  public int getOrdinal(K key) {
+    Entry<Integer, V> pair = primary.get(key);
+    if (pair != null) {
+      return pair.getKey();
+    }
+    return -1;
+  }
+
+  @Override
+  public int size() {
+    return delegate.size();
+  }
+
+  @Override
+  public boolean isEmpty() {
+    return delegate.isEmpty();
+  }
+
+  @Override
+  public V get(Object key) {
+    return delegate.get(key);
+  }
+
+  /**
+   * Inserts the tuple (key, value) into the map extending the semantics of {@link Map#put} with automatic ordinal
+   * assignment. A new ordinal is assigned if the key does not exist. Otherwise the same ordinal is re-used but the
+   * value is replaced.
+   *
+   * @see java.util.Map#put
+   */
+  @Override
+  public V put(K key, V value) {
+    return delegate.put(key, value);
+  }
+
+  @Override
+  public Collection<V> values() {
+    return delegate.values();
+  }
+
+  @Override
+  public boolean containsKey(Object key) {
+    return delegate.containsKey(key);
+  }
+
+  @Override
+  public boolean containsValue(Object value) {
+    return delegate.containsValue(value);
+  }
+
+  /**
+   * Removes the element corresponding to the key, if it exists, extending the semantics of {@link Map#remove} with
+   * ordinal recycling. The ordinal corresponding to the given key may be re-assigned to another tuple. It is important
+   * that consumers check the ordinal value via {@link #getOrdinal(Object)} before attempting to look up by ordinal.
+   *
+   * @see java.util.Map#remove
+   */
+  @Override
+  public V remove(Object key) {
+    return delegate.remove(key);
+  }
+
+  @Override
+  public void putAll(Map<? extends K, ? extends V> m) {
+    delegate.putAll(m);
+  }
+
+  @Override
+  public void clear() {
+    delegate.clear();
+  }
+
+  @Override
+  public Set<K> keySet() {
+    return delegate.keySet();
+  }
+
+  @Override
+  public Set<Entry<K, V>> entrySet() {
+    return delegate.entrySet();
+  }
+}
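The ordinal recycling described in the class comment (ordinals always cover [0, N), and a removed ordinal is handed to another entry) is commonly done by moving the entry with the highest ordinal into the vacated slot. The following is a minimal standalone sketch of that idea, not the MapWithOrdinal implementation from this patch: the class name OrdinalMapSketch is hypothetical, and it assumes plain java.util collections with an extra ordinal-to-key list instead of Netty's IntObjectHashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration of "swap with last" ordinal recycling.
class OrdinalMapSketch<K, V> {
  private final Map<K, Integer> ordinals = new HashMap<>(); // key -> ordinal
  private final List<K> keys = new ArrayList<>();           // ordinal -> key
  private final List<V> values = new ArrayList<>();         // ordinal -> value

  V put(K key, V value) {
    Integer ordinal = ordinals.get(key);
    if (ordinal == null) {
      // New key: the next free ordinal is the current size.
      ordinals.put(key, keys.size());
      keys.add(key);
      values.add(value);
      return null;
    }
    // Existing key: keep the ordinal, replace the value.
    return values.set(ordinal, value);
  }

  V remove(K key) {
    Integer ordinal = ordinals.remove(key);
    if (ordinal == null) {
      return null;
    }
    int last = keys.size() - 1;
    K lastKey = keys.remove(last);
    V lastValue = values.remove(last);
    if (ordinal == last) {
      return lastValue; // the removed entry was the last one, nothing to move
    }
    // Move the entry that held the highest ordinal into the vacated slot.
    keys.set(ordinal, lastKey);
    V removed = values.set(ordinal, lastValue);
    ordinals.put(lastKey, ordinal);
    return removed;
  }

  int getOrdinal(K key) {
    Integer ordinal = ordinals.get(key);
    return ordinal == null ? -1 : ordinal;
  }

  V getByOrdinal(int ordinal) {
    return (ordinal >= 0 && ordinal < values.size()) ? values.get(ordinal) : null;
  }

  int size() {
    return keys.size();
  }
}
```

Keeping an ordinal-to-key list makes the move constant time, at the cost of one more structure to keep in sync. It also shows why callers should re-check getOrdinal after a removal: the key that previously held the highest ordinal now answers to the removed one.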

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/OversizedAllocationException.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/OversizedAllocationException.java b/java/vector/src/main/java/org/apache/arrow/vector/util/OversizedAllocationException.java
new file mode 100644
index 0000000..ec628b2
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/OversizedAllocationException.java
@@ -0,0 +1,49 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+
+/**
+ * An exception that is used to signal that an allocation request, in bytes, is greater than the maximum allowed by
+ * {@link org.apache.arrow.memory.BufferAllocator#buffer(int) allocator}.
+ *
+ * <p>Operators should handle this exception to split the batch and later resume the execution on the next
+ * {@link RecordBatch#next() iteration}.</p>
+ *
+ */
+public class OversizedAllocationException extends RuntimeException {
+  public OversizedAllocationException() {
+    super();
+  }
+
+  public OversizedAllocationException(String message, Throwable cause, boolean enableSuppression, boolean writableStackTrace) {
+    super(message, cause, enableSuppression, writableStackTrace);
+  }
+
+  public OversizedAllocationException(String message, Throwable cause) {
+    super(message, cause);
+  }
+
+  public OversizedAllocationException(String message) {
+    super(message);
+  }
+
+  public OversizedAllocationException(Throwable cause) {
+    super(cause);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/SchemaChangeRuntimeException.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/SchemaChangeRuntimeException.java b/java/vector/src/main/java/org/apache/arrow/vector/util/SchemaChangeRuntimeException.java
new file mode 100644
index 0000000..c281561
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/SchemaChangeRuntimeException.java
@@ -0,0 +1,41 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+
+public class SchemaChangeRuntimeException extends RuntimeException {
+  public SchemaChangeRuntimeException() {
+    super();
+  }
+
+  public SchemaChangeRuntimeException(String message, Throwable cause, boolean enableSuppression, boolean writableStackTrace) {
+    super(message, cause, enableSuppression, writableStackTrace);
+  }
+
+  public SchemaChangeRuntimeException(String message, Throwable cause) {
+    super(message, cause);
+  }
+
+  public SchemaChangeRuntimeException(String message) {
+    super(message);
+  }
+
+  public SchemaChangeRuntimeException(Throwable cause) {
+    super(cause);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/Text.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/Text.java b/java/vector/src/main/java/org/apache/arrow/vector/util/Text.java
new file mode 100644
index 0000000..3919f06
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/Text.java
@@ -0,0 +1,621 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import java.io.DataInput;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.Charset;
+import java.nio.charset.CharsetDecoder;
+import java.nio.charset.CharsetEncoder;
+import java.nio.charset.CodingErrorAction;
+import java.nio.charset.MalformedInputException;
+import java.text.CharacterIterator;
+import java.text.StringCharacterIterator;
+import java.util.Arrays;
+
+import com.fasterxml.jackson.core.JsonGenerationException;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.annotation.JsonSerialize;
+import com.fasterxml.jackson.databind.ser.std.StdSerializer;
+
+/**
+ * A simplified byte wrapper similar to Hadoop's Text class without all the dependencies. Lifted from Hadoop 2.7.1
+ */
+@JsonSerialize(using = Text.TextSerializer.class)
+public class Text {
+
+  private static ThreadLocal<CharsetEncoder> ENCODER_FACTORY =
+      new ThreadLocal<CharsetEncoder>() {
+        @Override
+        protected CharsetEncoder initialValue() {
+          return Charset.forName("UTF-8").newEncoder().
+              onMalformedInput(CodingErrorAction.REPORT).
+              onUnmappableCharacter(CodingErrorAction.REPORT);
+        }
+      };
+
+  private static ThreadLocal<CharsetDecoder> DECODER_FACTORY =
+      new ThreadLocal<CharsetDecoder>() {
+        @Override
+        protected CharsetDecoder initialValue() {
+          return Charset.forName("UTF-8").newDecoder().
+              onMalformedInput(CodingErrorAction.REPORT).
+              onUnmappableCharacter(CodingErrorAction.REPORT);
+        }
+      };
+
+  private static final byte[] EMPTY_BYTES = new byte[0];
+
+  private byte[] bytes;
+  private int length;
+
+  public Text() {
+    bytes = EMPTY_BYTES;
+  }
+
+  /**
+   * Construct from a string.
+   */
+  public Text(String string) {
+    set(string);
+  }
+
+  /** Construct from another text. */
+  public Text(Text utf8) {
+    set(utf8);
+  }
+
+  /**
+   * Construct from a byte array.
+   */
+  public Text(byte[] utf8) {
+    set(utf8);
+  }
+
+  /**
+   * Get a copy of the bytes that is exactly the length of the data. See {@link #getBytes()} for faster access to the
+   * underlying array.
+   */
+  public byte[] copyBytes() {
+    byte[] result = new byte[length];
+    System.arraycopy(bytes, 0, result, 0, length);
+    return result;
+  }
+
+  /**
+   * Returns the raw bytes; however, only data up to {@link #getLength()} is valid. Please use {@link #copyBytes()} if
+   * you need the returned array to be precisely the length of the data.
+   */
+  public byte[] getBytes() {
+    return bytes;
+  }
+
+  /** Returns the number of bytes in the byte array */
+  public int getLength() {
+    return length;
+  }
+
+  /**
+   * Returns the Unicode Scalar Value (32-bit integer value) for the character at <code>position</code>. Note that this
+   * method avoids using the converter or doing String instantiation
+   *
+   * @return the Unicode scalar value at position or -1 if the position is invalid or points to a trailing byte
+   */
+  public int charAt(int position) {
+    if (position >= this.length)
+    {
+      return -1; // past the end of the valid data
+    }
+    if (position < 0)
+    {
+      return -1; // negative position
+    }
+
+    ByteBuffer bb = (ByteBuffer) ByteBuffer.wrap(bytes).position(position);
+    return bytesToCodePoint(bb.slice());
+  }
+
+  public int find(String what) {
+    return find(what, 0);
+  }
+
+  /**
+   * Finds any occurrence of <code>what</code> in the backing buffer, starting at position <code>start</code>. The
+   * starting position is measured in bytes and the return value is in terms of byte position in the buffer. The backing
+   * buffer is not converted to a string for this operation.
+   *
+   * @return byte position of the first occurrence of the search string in the UTF-8 buffer or -1 if not found
+   */
+  public int find(String what, int start) {
+    try {
+      ByteBuffer src = ByteBuffer.wrap(this.bytes, 0, this.length);
+      ByteBuffer tgt = encode(what);
+      byte b = tgt.get();
+      src.position(start);
+
+      while (src.hasRemaining()) {
+        if (b == src.get()) { // matching first byte
+          src.mark(); // save position in loop
+          tgt.mark(); // save position in target
+          boolean found = true;
+          int pos = src.position() - 1;
+          while (tgt.hasRemaining()) {
+            if (!src.hasRemaining()) { // src expired first
+              tgt.reset();
+              src.reset();
+              found = false;
+              break;
+            }
+            if (!(tgt.get() == src.get())) {
+              tgt.reset();
+              src.reset();
+              found = false;
+              break; // no match
+            }
+          }
+          if (found) {
+            return pos;
+          }
+        }
+      }
+      return -1; // not found
+    } catch (CharacterCodingException e) {
+      // can't get here
+      e.printStackTrace();
+      return -1;
+    }
+  }
+
+  /**
+   * Set to contain the contents of a string.
+   */
+  public void set(String string) {
+    try {
+      ByteBuffer bb = encode(string, true);
+      bytes = bb.array();
+      length = bb.limit();
+    } catch (CharacterCodingException e) {
+      throw new RuntimeException("Should not have happened ", e);
+    }
+  }
+
+  /**
+   * Set to a utf8 byte array
+   */
+  public void set(byte[] utf8) {
+    set(utf8, 0, utf8.length);
+  }
+
+  /** copy a text. */
+  public void set(Text other) {
+    set(other.getBytes(), 0, other.getLength());
+  }
+
+  /**
+   * Set the Text to a range of bytes
+   *
+   * @param utf8
+   *          the data to copy from
+   * @param start
+   *          the first position of the new string
+   * @param len
+   *          the number of bytes of the new string
+   */
+  public void set(byte[] utf8, int start, int len) {
+    setCapacity(len, false);
+    System.arraycopy(utf8, start, bytes, 0, len);
+    this.length = len;
+  }
+
+  /**
+   * Append a range of bytes to the end of the given text
+   *
+   * @param utf8
+   *          the data to copy from
+   * @param start
+   *          the first position to append from utf8
+   * @param len
+   *          the number of bytes to append
+   */
+  public void append(byte[] utf8, int start, int len) {
+    setCapacity(length + len, true);
+    System.arraycopy(utf8, start, bytes, length, len);
+    length += len;
+  }
+
+  /**
+   * Clear the string to empty.
+   *
+   * <em>Note</em>: For performance reasons, this call does not clear the underlying byte array that is retrievable via
+   * {@link #getBytes()}. In order to free the byte-array memory, call {@link #set(byte[])} with an empty byte array
+   * (For example, <code>new byte[0]</code>).
+   */
+  public void clear() {
+    length = 0;
+  }
+
+  /*
+   * Sets the capacity of this Text object to <em>at least</em> <code>len</code> bytes. If the current buffer is longer,
+   * then the capacity and existing content of the buffer are unchanged. If <code>len</code> is larger than the current
+   * capacity, the Text object's capacity is increased to match.
+   *
+   * @param len the number of bytes we need
+   *
+   * @param keepData should the old data be kept
+   */
+  private void setCapacity(int len, boolean keepData) {
+    if (bytes == null || bytes.length < len) {
+      if (bytes != null && keepData) {
+        bytes = Arrays.copyOf(bytes, Math.max(len, length << 1));
+      } else {
+        bytes = new byte[len];
+      }
+    }
+  }
+
+  /**
+   * Convert text back to string
+   *
+   * @see java.lang.Object#toString()
+   */
+  @Override
+  public String toString() {
+    try {
+      return decode(bytes, 0, length);
+    } catch (CharacterCodingException e) {
+      throw new RuntimeException("Should not have happened ", e);
+    }
+  }
+
+  /**
+   * Read a Text object whose length is already known. This allows creating Text from a stream which uses a different
+   * serialization format.
+   */
+  public void readWithKnownLength(DataInput in, int len) throws IOException {
+    setCapacity(len, false);
+    in.readFully(bytes, 0, len);
+    length = len;
+  }
+
+  /** Returns true iff <code>o</code> is a Text with the same contents. */
+  @Override
+  public boolean equals(Object o) {
+    if (!(o instanceof Text)) {
+      return false;
+    }
+
+    final Text that = (Text) o;
+    if (this.getLength() != that.getLength()) {
+      return false;
+    }
+
+    byte[] thisBytes = Arrays.copyOf(this.getBytes(), getLength());
+    byte[] thatBytes = Arrays.copyOf(that.getBytes(), getLength());
+    return Arrays.equals(thisBytes, thatBytes);
+
+  }
+
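+  @Override
+  public int hashCode() {
+    int hash = 1; // hash only the valid bytes, keeping hashCode consistent with equals
+    for (int i = 0; i < length; i++) {
+      hash = (31 * hash) + bytes[i];
+    }
+    return hash;
+  }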
+
+  // / STATIC UTILITIES FROM HERE DOWN
+  /**
+   * Converts the provided byte array to a String using the UTF-8 encoding. Malformed input is replaced by a
+   * default value.
+   */
+  public static String decode(byte[] utf8) throws CharacterCodingException {
+    return decode(ByteBuffer.wrap(utf8), true);
+  }
+
+  public static String decode(byte[] utf8, int start, int length)
+      throws CharacterCodingException {
+    return decode(ByteBuffer.wrap(utf8, start, length), true);
+  }
+
+  /**
+   * Converts the provided byte array to a String using the UTF-8 encoding. If <code>replace</code> is true, then
+   * malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a
+   * MalformedInputException.
+   */
+  public static String decode(byte[] utf8, int start, int length, boolean replace)
+      throws CharacterCodingException {
+    return decode(ByteBuffer.wrap(utf8, start, length), replace);
+  }
+
+  private static String decode(ByteBuffer utf8, boolean replace)
+      throws CharacterCodingException {
+    CharsetDecoder decoder = DECODER_FACTORY.get();
+    if (replace) {
+      decoder.onMalformedInput(
+          java.nio.charset.CodingErrorAction.REPLACE);
+      decoder.onUnmappableCharacter(CodingErrorAction.REPLACE);
+    }
+    String str = decoder.decode(utf8).toString();
+    // set decoder back to its default value: REPORT
+    if (replace) {
+      decoder.onMalformedInput(CodingErrorAction.REPORT);
+      decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
+    }
+    return str;
+  }
+
+  /**
+   * Converts the provided String to bytes using the UTF-8 encoding. If the input is malformed, invalid chars are
+   * replaced by a default value.
+   *
+   * @return ByteBuffer: the bytes are stored in ByteBuffer.array() and the length is ByteBuffer.limit()
+   */
+
+  public static ByteBuffer encode(String string)
+      throws CharacterCodingException {
+    return encode(string, true);
+  }
+
+  /**
+   * Converts the provided String to bytes using the UTF-8 encoding. If <code>replace</code> is true, then malformed
+   * input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a
+   * MalformedInputException.
+   *
+   * @return ByteBuffer: the bytes are stored in ByteBuffer.array() and the length is ByteBuffer.limit()
+   */
+  public static ByteBuffer encode(String string, boolean replace)
+      throws CharacterCodingException {
+    CharsetEncoder encoder = ENCODER_FACTORY.get();
+    if (replace) {
+      encoder.onMalformedInput(CodingErrorAction.REPLACE);
+      encoder.onUnmappableCharacter(CodingErrorAction.REPLACE);
+    }
+    ByteBuffer bytes =
+        encoder.encode(CharBuffer.wrap(string.toCharArray()));
+    if (replace) {
+      encoder.onMalformedInput(CodingErrorAction.REPORT);
+      encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
+    }
+    return bytes;
+  }
+
+  static final public int DEFAULT_MAX_LEN = 1024 * 1024;
+
+  // //// states for validateUTF8
+
+  private static final int LEAD_BYTE = 0;
+
+  private static final int TRAIL_BYTE_1 = 1;
+
+  private static final int TRAIL_BYTE = 2;
+
+  /**
+   * Check if a byte array contains valid utf-8
+   *
+   * @param utf8
+   *          byte array
+   * @throws MalformedInputException
+   *           if the byte array contains invalid utf-8
+   */
+  public static void validateUTF8(byte[] utf8) throws MalformedInputException {
+    validateUTF8(utf8, 0, utf8.length);
+  }
+
+  /**
+   * Check to see if a byte array is valid utf-8
+   *
+   * @param utf8
+   *          the array of bytes
+   * @param start
+   *          the offset of the first byte in the array
+   * @param len
+   *          the length of the byte sequence
+   * @throws MalformedInputException
+   *           if the byte array contains invalid bytes
+   */
+  public static void validateUTF8(byte[] utf8, int start, int len)
+      throws MalformedInputException {
+    int count = start;
+    int leadByte = 0;
+    int length = 0;
+    int state = LEAD_BYTE;
+    while (count < start + len) {
+      int aByte = utf8[count] & 0xFF;
+
+      switch (state) {
+      case LEAD_BYTE:
+        leadByte = aByte;
+        length = bytesFromUTF8[aByte];
+
+        switch (length) {
+        case 0: // check for ASCII
+          if (leadByte > 0x7F) {
+            throw new MalformedInputException(count);
+          }
+          break;
+        case 1:
+          if (leadByte < 0xC2 || leadByte > 0xDF) {
+            throw new MalformedInputException(count);
+          }
+          state = TRAIL_BYTE_1;
+          break;
+        case 2:
+          if (leadByte < 0xE0 || leadByte > 0xEF) {
+            throw new MalformedInputException(count);
+          }
+          state = TRAIL_BYTE_1;
+          break;
+        case 3:
+          if (leadByte < 0xF0 || leadByte > 0xF4) {
+            throw new MalformedInputException(count);
+          }
+          state = TRAIL_BYTE_1;
+          break;
+        default:
+          // too long! Longest valid UTF-8 is 4 bytes (lead + three)
+          // or if < 0 we got a trail byte in the lead byte position
+          throw new MalformedInputException(count);
+        } // switch (length)
+        break;
+
+      case TRAIL_BYTE_1:
+        if (leadByte == 0xF0 && aByte < 0x90) {
+          throw new MalformedInputException(count);
+        }
+        if (leadByte == 0xF4 && aByte > 0x8F) {
+          throw new MalformedInputException(count);
+        }
+        if (leadByte == 0xE0 && aByte < 0xA0) {
+          throw new MalformedInputException(count);
+        }
+        if (leadByte == 0xED && aByte > 0x9F) {
+          throw new MalformedInputException(count);
+        }
+        // falls through to regular trail-byte test!!
+      case TRAIL_BYTE:
+        if (aByte < 0x80 || aByte > 0xBF) {
+          throw new MalformedInputException(count);
+        }
+        if (--length == 0) {
+          state = LEAD_BYTE;
+        } else {
+          state = TRAIL_BYTE;
+        }
+        break;
+      default:
+        break;
+      } // switch (state)
+      count++;
+    }
+  }
+
+  /**
+   * Magic numbers for UTF-8. These are the number of bytes that <em>follow</em> a given lead byte. Trailing bytes have
+   * the value -1. The values 4 and 5 are presented in this table, even though valid UTF-8 cannot include the five and
+   * six byte sequences.
+   */
+  static final int[] bytesFromUTF8 =
+  { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+      0, 0, 0, 0, 0, 0, 0,
+      // trail bytes
+      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1,
+      1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+      1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
+      3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5 };
+
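+  /**
+   * Returns the next code point at the current position in the buffer. The buffer's position will be incremented. Any
+   * mark set on this buffer will be changed by this method!
+   *
+   * @param bytes
+   *          buffer positioned at the lead byte of a UTF-8 sequence
+   * @return the decoded code point, or -1 if the byte at the current position is a trailing byte (the position is
+   *         left unchanged in that case)
+   */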
+  public static int bytesToCodePoint(ByteBuffer bytes) {
+    bytes.mark();
+    byte b = bytes.get();
+    bytes.reset();
+    int extraBytesToRead = bytesFromUTF8[(b & 0xFF)];
+    if (extraBytesToRead < 0) {
+      return -1; // trailing byte!
+    }
+    int ch = 0;
+
+    switch (extraBytesToRead) {
+    case 5:
+      ch += (bytes.get() & 0xFF);
+      ch <<= 6; /* remember, illegal UTF-8 */
+    case 4:
+      ch += (bytes.get() & 0xFF);
+      ch <<= 6; /* remember, illegal UTF-8 */
+    case 3:
+      ch += (bytes.get() & 0xFF);
+      ch <<= 6;
+    case 2:
+      ch += (bytes.get() & 0xFF);
+      ch <<= 6;
+    case 1:
+      ch += (bytes.get() & 0xFF);
+      ch <<= 6;
+    case 0:
+      ch += (bytes.get() & 0xFF);
+    }
+    ch -= offsetsFromUTF8[extraBytesToRead];
+
+    return ch;
+  }
+
+  static final int offsetsFromUTF8[] =
+  { 0x00000000, 0x00003080,
+      0x000E2080, 0x03C82080, 0xFA082080, 0x82082080 };
+
+  /**
+   * For the given string, returns the number of UTF-8 bytes required to encode the string.
+   *
+   * @param string
+   *          text to encode
+   * @return number of UTF-8 bytes required to encode
+   */
+  public static int utf8Length(String string) {
+    CharacterIterator iter = new StringCharacterIterator(string);
+    char ch = iter.first();
+    int size = 0;
+    while (ch != CharacterIterator.DONE) {
+      if ((ch >= 0xD800) && (ch < 0xDC00)) {
+        // surrogate pair?
+        char trail = iter.next();
+        if ((trail > 0xDBFF) && (trail < 0xE000)) {
+          // valid pair
+          size += 4;
+        } else {
+          // invalid pair
+          size += 3;
+          iter.previous(); // rewind one
+        }
+      } else if (ch < 0x80) {
+        size++;
+      } else if (ch < 0x800) {
+        size += 2;
+      } else {
+        // ch < 0x10000, that is, the largest char value
+        size += 3;
+      }
+      ch = iter.next();
+    }
+    return size;
+  }
+
+  public static class TextSerializer extends StdSerializer<Text> {
+
+    public TextSerializer() {
+      super(Text.class);
+    }
+
+    @Override
+    public void serialize(Text text, JsonGenerator jsonGenerator, SerializerProvider serializerProvider)
+        throws IOException, JsonGenerationException {
+      jsonGenerator.writeString(text.toString());
+    }
+  }
+}
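
The UTF-8 helpers in Text.java above can be cross-checked against the JDK. What follows is a minimal sketch and not part of the commit: the class name Utf8Sketch and both method names are made up for illustration. isValidUtf8 leans on the JDK's strict CharsetDecoder instead of the hand-written state machine in validateUTF8, and utf8Length restates the same surrogate-pair counting rule using String.charAt.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the patch. */
final class Utf8Sketch {

  /** Returns true if the bytes decode as strict UTF-8 according to the JDK decoder. */
  static boolean isValidUtf8(byte[] bytes) {
    try {
      StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  /** Counts the UTF-8 bytes needed for a string, charging 3 bytes for an unpaired surrogate. */
  static int utf8Length(String s) {
    int size = 0;
    for (int i = 0; i < s.length(); i++) {
      char ch = s.charAt(i);
      if (Character.isHighSurrogate(ch)
          && i + 1 < s.length()
          && Character.isLowSurrogate(s.charAt(i + 1))) {
        size += 4; // a valid surrogate pair encodes one supplementary code point
        i++;       // skip the low surrogate that was just consumed
      } else if (ch < 0x80) {
        size += 1;
      } else if (ch < 0x800) {
        size += 2;
      } else {
        size += 3; // rest of the BMP, including unpaired surrogates
      }
    }
    return size;
  }
}
```

For well-formed strings, utf8Length(s) should agree with s.getBytes(StandardCharsets.UTF_8).length. For an unpaired surrogate the two differ, because the JDK encoder substitutes a one-byte '?' while the counting rule charges three bytes.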

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/TransferPair.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/TransferPair.java b/java/vector/src/main/java/org/apache/arrow/vector/util/TransferPair.java
new file mode 100644
index 0000000..6e68d55
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/TransferPair.java
@@ -0,0 +1,27 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import org.apache.arrow.vector.ValueVector;
+
+public interface TransferPair {
+  public void transfer();
+  public void splitAndTransfer(int startIndex, int length);
+  public ValueVector getTo();
+  public void copyValueSafe(int from, int to);
+}
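
The TransferPair interface above only declares its four operations. As a rough illustration of what an implementation might do, here is a toy sketch over a plain int array. ToyIntVector and ToyTransferPair are hypothetical names invented for this example, and the semantics shown (move everything, hand over a slice, copy one value with growth) are an assumption about the intent of the interface rather than Arrow's actual buffer handling.

```java
import java.util.Arrays;

/** Toy stand-in for a value vector: a growable int column (hypothetical, not Arrow's ValueVector). */
final class ToyIntVector {
  int[] data = new int[0];
  int valueCount = 0;

  /** Writes a value, growing the backing array when the index is out of capacity. */
  void set(int index, int value) {
    if (index >= data.length) {
      data = Arrays.copyOf(data, Math.max(index + 1, data.length * 2));
    }
    data[index] = value;
    valueCount = Math.max(valueCount, index + 1);
  }

  int get(int index) {
    if (index < 0 || index >= valueCount) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    return data[index];
  }

  void clear() {
    data = new int[0];
    valueCount = 0;
  }
}

/** The same four operations as TransferPair, specialised to the toy vector. */
final class ToyTransferPair {
  private final ToyIntVector from;
  private final ToyIntVector to;

  ToyTransferPair(ToyIntVector from, ToyIntVector to) {
    this.from = from;
    this.to = to;
  }

  /** Moves the whole backing array to the target and leaves the source empty. */
  void transfer() {
    to.data = from.data;
    to.valueCount = from.valueCount;
    from.clear();
  }

  /** Gives the target a copy of the slice [startIndex, startIndex + length); the source is untouched here. */
  void splitAndTransfer(int startIndex, int length) {
    if (startIndex < 0 || length < 0 || startIndex + length > from.valueCount) {
      throw new IndexOutOfBoundsException("slice out of range");
    }
    to.data = Arrays.copyOfRange(from.data, startIndex, startIndex + length);
    to.valueCount = length;
  }

  ToyIntVector getTo() {
    return to;
  }

  /** Copies one value; "safe" here means the target grows if toIndex is past its capacity. */
  void copyValueSafe(int fromIndex, int toIndex) {
    to.set(toIndex, from.get(fromIndex));
  }
}
```

The toy version copies the slice in splitAndTransfer for simplicity; a buffer-backed implementation could share or re-own the underlying memory instead.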


[07/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/HolderReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/HolderReaderImpl.java b/java/vector/src/main/codegen/templates/HolderReaderImpl.java
new file mode 100644
index 0000000..3005fca
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/HolderReaderImpl.java
@@ -0,0 +1,290 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+<#list ["", "Nullable", "Repeated"] as holderMode>
+<#assign nullMode = holderMode />
+<#if holderMode == "Repeated"><#assign nullMode = "Nullable" /></#if>
+
+<#assign lowerName = minor.class?uncap_first />
+<#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+<#assign name = minor.class?cap_first />
+<#assign javaType = (minor.javaType!type.javaType) />
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+<#assign safeType=friendlyType />
+<#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+<#assign fields = minor.fields!type.fields />
+
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/${holderMode}${name}HolderReaderImpl.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.joda.time.Period;
+
+// Source code generated using FreeMarker template ${.template_name}
+
+@SuppressWarnings("unused")
+public class ${holderMode}${name}HolderReaderImpl extends AbstractFieldReader {
+
+  private ${nullMode}${name}Holder holder;
+<#if holderMode == "Repeated" >
+  private int index = -1;
+  private ${holderMode}${name}Holder repeatedHolder;
+</#if>
+
+  public ${holderMode}${name}HolderReaderImpl(${holderMode}${name}Holder holder) {
+<#if holderMode == "Repeated" >
+    this.holder = new ${nullMode}${name}Holder();
+    this.repeatedHolder = holder;
+<#else>
+    this.holder = holder;
+</#if>
+  }
+
+  @Override
+  public int size() {
+<#if holderMode == "Repeated">
+    return repeatedHolder.end - repeatedHolder.start;
+<#else>
+    throw new UnsupportedOperationException("You can't call size on a Holder value reader.");
+</#if>
+  }
+
+  @Override
+  public boolean next() {
+<#if holderMode == "Repeated">
+    if(repeatedHolder.start + index + 1 < repeatedHolder.end) {
+      index++;
+      repeatedHolder.vector.getAccessor().get(repeatedHolder.start + index, holder);
+      return true;
+    } else {
+      return false;
+    }
+<#else>
+    throw new UnsupportedOperationException("You can't call next on a single value reader.");
+</#if>
+
+  }
+
+  @Override
+  public void setPosition(int index) {
+    throw new UnsupportedOperationException("You can't call setPosition on a Holder value reader.");
+  }
+
+  @Override
+  public MajorType getType() {
+<#if holderMode == "Repeated">
+    return this.repeatedHolder.TYPE;
+<#else>
+    return this.holder.TYPE;
+</#if>
+  }
+
+  @Override
+  public boolean isSet() {
+    <#if holderMode == "Repeated">
+    return this.repeatedHolder.end!=this.repeatedHolder.start;
+    <#elseif nullMode == "Nullable">
+    return this.holder.isSet == 1;
+    <#else>
+    return true;
+    </#if>
+    
+  }
+
+<#if holderMode != "Repeated">
+  @Override
+  public void read(${name}Holder h) {
+  <#list fields as field>
+    h.${field.name} = holder.${field.name};
+  </#list>
+  }
+
+  @Override
+  public void read(Nullable${name}Holder h) {
+  <#list fields as field>
+    h.${field.name} = holder.${field.name};
+  </#list>
+    h.isSet = isSet() ? 1 : 0;
+  }
+</#if>
+
+<#if holderMode == "Repeated">
+  @Override
+  public ${friendlyType} read${safeType}(int index){
+    repeatedHolder.vector.getAccessor().get(repeatedHolder.start + index, holder);
+    ${friendlyType} value = read${safeType}();
+    if (this.index > -1) {
+      repeatedHolder.vector.getAccessor().get(repeatedHolder.start + this.index, holder);
+    }
+    return value;
+  }
+</#if>
+
+  @Override
+  public ${friendlyType} read${safeType}(){
+<#if nullMode == "Nullable">
+    if (!isSet()) {
+      return null;
+    }
+</#if>
+
+<#if type.major == "VarLen">
+
+      int length = holder.end - holder.start;
+      byte[] value = new byte[length];
+      holder.buffer.getBytes(holder.start, value, 0, length);
+
+<#if minor.class == "VarBinary">
+      return value;
+<#elseif minor.class == "Var16Char">
+      return new String(value);
+<#elseif minor.class == "VarChar">
+      Text text = new Text();
+      text.set(value);
+      return text;
+</#if>
+
+<#elseif minor.class == "Interval">
+      Period p = new Period();
+      return p.plusMonths(holder.months).plusDays(holder.days).plusMillis(holder.milliseconds);
+
+<#elseif minor.class == "IntervalDay">
+      Period p = new Period();
+      return p.plusDays(holder.days).plusMillis(holder.milliseconds);
+
+<#elseif minor.class == "Decimal9" ||
+         minor.class == "Decimal18" >
+      BigInteger value = BigInteger.valueOf(holder.value);
+      return new BigDecimal(value, holder.scale);
+
+<#elseif minor.class == "Decimal28Dense" ||
+         minor.class == "Decimal38Dense">
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromDense(holder.buffer,
+                                                                                holder.start,
+                                                                                holder.nDecimalDigits,
+                                                                                holder.scale,
+                                                                                holder.maxPrecision,
+                                                                                holder.WIDTH);
+
+<#elseif minor.class == "Decimal28Sparse" ||
+         minor.class == "Decimal38Sparse">
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromSparse(holder.buffer,
+                                                                                 holder.start,
+                                                                                 holder.nDecimalDigits,
+                                                                                 holder.scale);
+
+<#elseif minor.class == "Bit" >
+      return Boolean.valueOf(holder.value != 0);
+<#else>
+      ${friendlyType} value = new ${friendlyType}(this.holder.value);
+      return value;
+</#if>
+
+  }
+
+  @Override
+  public Object readObject() {
+<#if holderMode == "Repeated" >
+    List<Object> valList = Lists.newArrayList();
+    for (int i = repeatedHolder.start; i < repeatedHolder.end; i++) {
+      valList.add(repeatedHolder.vector.getAccessor().getObject(i));
+    }
+    return valList;
+<#else>
+    return readSingleObject();
+</#if>
+  }
+
+  private Object readSingleObject() {
+<#if nullMode == "Nullable">
+    if (!isSet()) {
+      return null;
+    }
+</#if>
+
+<#if type.major == "VarLen">
+      int length = holder.end - holder.start;
+      byte[] value = new byte[length];
+      holder.buffer.getBytes(holder.start, value, 0, length);
+
+<#if minor.class == "VarBinary">
+      return value;
+<#elseif minor.class == "Var16Char">
+      return new String(value);
+<#elseif minor.class == "VarChar">
+      Text text = new Text();
+      text.set(value);
+      return text;
+</#if>
+
+<#elseif minor.class == "Interval">
+      Period p = new Period();
+      return p.plusMonths(holder.months).plusDays(holder.days).plusMillis(holder.milliseconds);
+
+<#elseif minor.class == "IntervalDay">
+      Period p = new Period();
+      return p.plusDays(holder.days).plusMillis(holder.milliseconds);
+
+<#elseif minor.class == "Decimal9" ||
+         minor.class == "Decimal18" >
+      BigInteger value = BigInteger.valueOf(holder.value);
+      return new BigDecimal(value, holder.scale);
+
+<#elseif minor.class == "Decimal28Dense" ||
+         minor.class == "Decimal38Dense">
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromDense(holder.buffer,
+                                                                                holder.start,
+                                                                                holder.nDecimalDigits,
+                                                                                holder.scale,
+                                                                                holder.maxPrecision,
+                                                                                holder.WIDTH);
+
+<#elseif minor.class == "Decimal28Sparse" ||
+         minor.class == "Decimal38Sparse">
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromSparse(holder.buffer,
+                                                                                 holder.start,
+                                                                                 holder.nDecimalDigits,
+                                                                                 holder.scale);
+
+<#elseif minor.class == "Bit" >
+      return Boolean.valueOf(holder.value != 0);
+<#else>
+      ${friendlyType} value = new ${friendlyType}(this.holder.value);
+      return value;
+</#if>
+  }
+
+<#if holderMode != "Repeated" && nullMode != "Nullable">
+  public void copyAsValue(${minor.class?cap_first}Writer writer){
+    writer.write(holder);
+  }
+</#if>
+}
+
+</#list>
+</#list>
+</#list>
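
To make the template above more concrete, here is a hand-written sketch of roughly what one expansion could look like for a nullable decimal holder. ToyNullableDecimalHolder and ToyNullableDecimalHolderReader are hypothetical stand-ins defined only for this example. The isSet check and the BigDecimal construction follow the Nullable and Decimal9/Decimal18 branches of the template, but the real generated classes extend AbstractFieldReader and carry more members.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical holder: a plain struct, mirroring the fields the template reads. */
final class ToyNullableDecimalHolder {
  int isSet; // 1 when a value is present, 0 for null
  int value; // unscaled decimal digits
  int scale; // number of digits to the right of the decimal point
}

/** Hypothetical reader over a single holder value. */
final class ToyNullableDecimalHolderReader {
  private final ToyNullableDecimalHolder holder;

  ToyNullableDecimalHolderReader(ToyNullableDecimalHolder holder) {
    this.holder = holder;
  }

  boolean isSet() {
    return holder.isSet == 1;
  }

  /** Returns null for an unset holder, otherwise unscaledValue * 10^-scale. */
  BigDecimal readBigDecimal() {
    if (!isSet()) {
      return null;
    }
    BigInteger unscaled = BigInteger.valueOf(holder.value);
    return new BigDecimal(unscaled, holder.scale);
  }
}
```

With value = 12345 and scale = 2 the reader yields 123.45; flipping isSet to 0 makes the same call return null, which is the behaviour the Nullable branch of the template encodes.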

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/ListWriters.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/ListWriters.java b/java/vector/src/main/codegen/templates/ListWriters.java
new file mode 100644
index 0000000..cf9fa30
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/ListWriters.java
@@ -0,0 +1,234 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+
+<#list ["Single", "Repeated"] as mode>
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/${mode}ListWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+<#if mode == "Single">
+  <#assign containerClass = "AbstractContainerVector" />
+  <#assign index = "idx()">
+<#else>
+  <#assign containerClass = "RepeatedListVector" />
+  <#assign index = "currentChildIndex">
+</#if>
+
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using FreeMarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+public class ${mode}ListWriter extends AbstractFieldWriter {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${mode}ListWriter.class);
+
+  static enum Mode { INIT, IN_MAP, IN_LIST <#list vv.types as type><#list type.minor as minor>, IN_${minor.class?upper_case}</#list></#list> }
+
+  private final String name;
+  protected final ${containerClass} container;
+  private Mode mode = Mode.INIT;
+  private FieldWriter writer;
+  protected RepeatedValueVector innerVector;
+
+  <#if mode == "Repeated">private int currentChildIndex = 0;</#if>
+  public ${mode}ListWriter(String name, ${containerClass} container, FieldWriter parent){
+    super(parent);
+    this.name = name;
+    this.container = container;
+  }
+
+  public ${mode}ListWriter(${containerClass} container, FieldWriter parent){
+    super(parent);
+    this.name = null;
+    this.container = container;
+  }
+
+  @Override
+  public void allocate() {
+    if(writer != null) {
+      writer.allocate();
+    }
+
+    <#if mode == "Repeated">
+    container.allocateNew();
+    </#if>
+  }
+
+  @Override
+  public void clear() {
+    if (writer != null) {
+      writer.clear();
+    }
+  }
+
+  @Override
+  public void close() {
+    clear();
+    container.close();
+    if (innerVector != null) {
+      innerVector.close();
+    }
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return innerVector == null ? 0 : innerVector.getValueCapacity();
+  }
+
+  public void setValueCount(int count){
+    if(innerVector != null) innerVector.getMutator().setValueCount(count);
+  }
+
+  @Override
+  public MapWriter map() {
+    switch(mode) {
+    case INIT:
+      int vectorCount = container.size();
+      final RepeatedMapVector vector = container.addOrGet(name, RepeatedMapVector.TYPE, RepeatedMapVector.class);
+      innerVector = vector;
+      writer = new RepeatedMapWriter(vector, this);
+      if(vectorCount != container.size()) {
+        writer.allocate();
+      }
+      writer.setPosition(${index});
+      mode = Mode.IN_MAP;
+      return writer;
+    case IN_MAP:
+      return writer;
+    }
+
+    throw new RuntimeException(getUnsupportedErrorMsg("MAP", mode.name()));
+
+  }
+
+  @Override
+  public ListWriter list() {
+    switch(mode) {
+    case INIT:
+      final int vectorCount = container.size();
+      final RepeatedListVector vector = container.addOrGet(name, RepeatedListVector.TYPE, RepeatedListVector.class);
+      innerVector = vector;
+      writer = new RepeatedListWriter(null, vector, this);
+      if(vectorCount != container.size()) {
+        writer.allocate();
+      }
+      writer.setPosition(${index});
+      mode = Mode.IN_LIST;
+      return writer;
+    case IN_LIST:
+      return writer;
+    }
+
+    throw new RuntimeException(getUnsupportedErrorMsg("LIST", mode.name()));
+
+  }
+
+  <#list vv.types as type><#list type.minor as minor>
+  <#assign lowerName = minor.class?uncap_first />
+  <#assign upperName = minor.class?upper_case />
+  <#assign capName = minor.class?cap_first />
+  <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+
+  private static final MajorType ${upperName}_TYPE = Types.repeated(MinorType.${upperName});
+
+  @Override
+  public ${capName}Writer ${lowerName}() {
+    switch(mode) {
+    case INIT:
+      final int vectorCount = container.size();
+      final Repeated${capName}Vector vector = container.addOrGet(name, ${upperName}_TYPE, Repeated${capName}Vector.class);
+      innerVector = vector;
+      writer = new Repeated${capName}WriterImpl(vector, this);
+      if(vectorCount != container.size()) {
+        writer.allocate();
+      }
+      writer.setPosition(${index});
+      mode = Mode.IN_${upperName};
+      return writer;
+    case IN_${upperName}:
+      return writer;
+    }
+
+    throw new RuntimeException(getUnsupportedErrorMsg("${upperName}", mode.name()));
+
+  }
+  </#list></#list>
+
+  public MaterializedField getField() {
+    return container.getField();
+  }
+
+  <#if mode == "Repeated">
+
+  public void startList() {
+    final RepeatedListVector list = (RepeatedListVector) container;
+    final RepeatedListVector.RepeatedMutator mutator = list.getMutator();
+
+    // make sure that the current vector can support the end position of this list.
+    if(container.getValueCapacity() <= idx()) {
+      mutator.setValueCount(idx()+1);
+    }
+
+    // update the repeated vector to state that there are current+1 objects.
+    final RepeatedListHolder h = new RepeatedListHolder();
+    list.getAccessor().get(idx(), h);
+    if (h.start >= h.end) {
+      mutator.startNewValue(idx());
+    }
+    currentChildIndex = container.getMutator().add(idx());
+    if(writer != null) {
+      writer.setPosition(currentChildIndex);
+    }
+  }
+
+  public void endList() {
+    // noop, we initialize state at start rather than end.
+  }
+  <#else>
+
+  public void setPosition(int index) {
+    super.setPosition(index);
+    if(writer != null) {
+      writer.setPosition(index);
+    }
+  }
+
+  public void startList() {
+    // noop
+  }
+
+  public void endList() {
+    // noop
+  }
+  </#if>
+
+  private String getUnsupportedErrorMsg(String expected, String found) {
+    final String f = found.substring(3);
+    return String.format("In a list of type %s, encountered a value of type %s. "+
+      "Arrow does not support lists of different types.",
+       f, expected
+    );
+  }
+}
+</#list>
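
The list writer template above pins a list to the first element type that is requested and rejects any other type afterwards. The sketch below isolates just that mode tracking. ToyListWriter is a hypothetical class written for this note, with three element kinds instead of the generated one-per-minor-type set, and it throws IllegalStateException where the template throws a plain RuntimeException.

```java
/** Hypothetical miniature of the generated list writer's mode tracking; not Arrow's class. */
final class ToyListWriter {
  enum Mode { INIT, IN_MAP, IN_LIST, IN_INT }

  private Mode mode = Mode.INIT;

  void map() {
    enter(Mode.IN_MAP, "MAP");
  }

  void list() {
    enter(Mode.IN_LIST, "LIST");
  }

  void integer() {
    enter(Mode.IN_INT, "INT");
  }

  Mode mode() {
    return mode;
  }

  /** The first call fixes the element type; later calls must ask for the same type. */
  private void enter(Mode requested, String requestedName) {
    if (mode == Mode.INIT) {
      mode = requested;
      return;
    }
    if (mode == requested) {
      return;
    }
    // Strip the "IN_" prefix so the message names the element type, as the template does.
    String current = mode.name().substring(3);
    throw new IllegalStateException(String.format(
        "In a list of type %s, encountered a value of type %s.", current, requestedName));
  }
}
```

Calling integer() twice is fine, while integer() followed by map() fails with a message naming both INT and MAP.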

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/MapWriters.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/MapWriters.java b/java/vector/src/main/codegen/templates/MapWriters.java
new file mode 100644
index 0000000..7001367
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/MapWriters.java
@@ -0,0 +1,240 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<#list ["Single", "Repeated"] as mode>
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/${mode}MapWriter.java" />
+<#if mode == "Single">
+<#assign containerClass = "MapVector" />
+<#assign index = "idx()">
+<#else>
+<#assign containerClass = "RepeatedMapVector" />
+<#assign index = "currentChildIndex">
+</#if>
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+import java.util.Map;
+
+import org.apache.arrow.vector.holders.RepeatedMapHolder;
+import org.apache.arrow.vector.AllocationHelper;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+
+import com.google.common.collect.Maps;
+
+/*
+ * This class is generated using FreeMarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+public class ${mode}MapWriter extends AbstractFieldWriter {
+
+  protected final ${containerClass} container;
+  private final Map<String, FieldWriter> fields = Maps.newHashMap();
+  <#if mode == "Repeated">private int currentChildIndex = 0;</#if>
+
+  private final boolean unionEnabled;
+
+  public ${mode}MapWriter(${containerClass} container, FieldWriter parent, boolean unionEnabled) {
+    super(parent);
+    this.container = container;
+    this.unionEnabled = unionEnabled;
+  }
+
+  public ${mode}MapWriter(${containerClass} container, FieldWriter parent) {
+    this(container, parent, false);
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return container.getValueCapacity();
+  }
+
+  @Override
+  public boolean isEmptyMap() {
+    return 0 == container.size();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return container.getField();
+  }
+
+  @Override
+  public MapWriter map(String name) {
+    FieldWriter writer = fields.get(name.toLowerCase());
+    if(writer == null){
+      int vectorCount = container.size();
+      MapVector vector = container.addOrGet(name, MapVector.TYPE, MapVector.class);
+      if(!unionEnabled){
+        writer = new SingleMapWriter(vector, this);
+      } else {
+        writer = new PromotableWriter(vector, container);
+      }
+      if(vectorCount != container.size()) {
+        writer.allocate();
+      }
+      writer.setPosition(${index});
+      fields.put(name.toLowerCase(), writer);
+    }
+    return writer;
+  }
+
+  @Override
+  public void close() throws Exception {
+    clear();
+    container.close();
+  }
+
+  @Override
+  public void allocate() {
+    container.allocateNew();
+    for(final FieldWriter w : fields.values()) {
+      w.allocate();
+    }
+  }
+
+  @Override
+  public void clear() {
+    container.clear();
+    for(final FieldWriter w : fields.values()) {
+      w.clear();
+    }
+  }
+
+  @Override
+  public ListWriter list(String name) {
+    FieldWriter writer = fields.get(name.toLowerCase());
+    int vectorCount = container.size();
+    if(writer == null) {
+      if (!unionEnabled){
+        writer = new SingleListWriter(name,container,this);
+      } else{
+        writer = new PromotableWriter(container.addOrGet(name, Types.optional(MinorType.LIST), ListVector.class), container);
+      }
+      if (container.size() > vectorCount) {
+        writer.allocate();
+      }
+      writer.setPosition(${index});
+      fields.put(name.toLowerCase(), writer);
+    }
+    return writer;
+  }
+
+  <#if mode == "Repeated">
+  public void start() {
+    // Update the repeated vector to state that there are current+1 objects.
+    final RepeatedMapHolder h = new RepeatedMapHolder();
+    final RepeatedMapVector map = (RepeatedMapVector) container;
+    final RepeatedMapVector.Mutator mutator = map.getMutator();
+
+    // Make sure that the current vector can support the end position of this list.
+    if(container.getValueCapacity() <= idx()) {
+      mutator.setValueCount(idx()+1);
+    }
+
+    map.getAccessor().get(idx(), h);
+    if (h.start >= h.end) {
+      container.getMutator().startNewValue(idx());
+    }
+    currentChildIndex = container.getMutator().add(idx());
+    for(final FieldWriter w : fields.values()) {
+      w.setPosition(currentChildIndex);
+    }
+  }
+
+
+  public void end() {
+    // noop
+  }
+  <#else>
+
+  public void setValueCount(int count) {
+    container.getMutator().setValueCount(count);
+  }
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    for(final FieldWriter w: fields.values()) {
+      w.setPosition(index);
+    }
+  }
+
+  @Override
+  public void start() {
+  }
+
+  @Override
+  public void end() {
+  }
+
+  </#if>
+
+  <#list vv.types as type><#list type.minor as minor>
+  <#assign lowerName = minor.class?uncap_first />
+  <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+  <#assign upperName = minor.class?upper_case />
+  <#assign capName = minor.class?cap_first />
+  <#assign vectName = capName />
+  <#assign vectName = "Nullable${capName}" />
+
+  <#if minor.class?starts_with("Decimal") >
+  public ${minor.class}Writer ${lowerName}(String name) {
+    // returns existing writer
+    final FieldWriter writer = fields.get(name.toLowerCase());
+    assert writer != null;
+    return writer;
+  }
+
+  public ${minor.class}Writer ${lowerName}(String name, int scale, int precision) {
+    final MajorType ${upperName}_TYPE = new MajorType(MinorType.${upperName}, DataMode.OPTIONAL, scale, precision, null, null);
+  <#else>
+  private static final MajorType ${upperName}_TYPE = Types.optional(MinorType.${upperName});
+  @Override
+  public ${minor.class}Writer ${lowerName}(String name) {
+  </#if>
+    FieldWriter writer = fields.get(name.toLowerCase());
+    if(writer == null) {
+      ValueVector vector;
+      ValueVector currentVector = container.getChild(name);
+      if (unionEnabled){
+        ${vectName}Vector v = container.addOrGet(name, ${upperName}_TYPE, ${vectName}Vector.class);
+        writer = new PromotableWriter(v, container);
+        vector = v;
+      } else {
+        ${vectName}Vector v = container.addOrGet(name, ${upperName}_TYPE, ${vectName}Vector.class);
+        writer = new ${vectName}WriterImpl(v, this);
+        vector = v;
+      }
+      if (currentVector == null || currentVector != vector) {
+        vector.allocateNewSafe();
+      } 
+      writer.setPosition(${index});
+      fields.put(name.toLowerCase(), writer);
+    }
+    return writer;
+  }
+
+  </#list></#list>
+
+}
+</#list>

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/NullReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/NullReader.java b/java/vector/src/main/codegen/templates/NullReader.java
new file mode 100644
index 0000000..3ef6c7d
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/NullReader.java
@@ -0,0 +1,138 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/NullReader.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+
+@SuppressWarnings("unused")
+public class NullReader extends AbstractBaseReader implements FieldReader{
+  
+  public static final NullReader INSTANCE = new NullReader();
+  public static final NullReader EMPTY_LIST_INSTANCE = new NullReader(Types.repeated(MinorType.NULL));
+  public static final NullReader EMPTY_MAP_INSTANCE = new NullReader(Types.required(MinorType.MAP));
+  private MajorType type;
+  
+  private NullReader(){
+    super();
+    type = Types.required(MinorType.NULL);
+  }
+
+  private NullReader(MajorType type){
+    super();
+    this.type = type;
+  }
+
+  @Override
+  public MajorType getType() {
+    return type;
+  }
+  
+  public void copyAsValue(MapWriter writer) {}
+
+  public void copyAsValue(ListWriter writer) {}
+
+  public void copyAsValue(UnionWriter writer) {}
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  public void read(${name}Holder holder){
+    throw new UnsupportedOperationException("NullReader cannot write into non-nullable holder");
+  }
+
+  public void read(Nullable${name}Holder holder){
+    holder.isSet = 0;
+  }
+
+  public void read(int arrayIndex, ${name}Holder holder){
+    throw new ArrayIndexOutOfBoundsException();
+  }
+  
+  public void copyAsValue(${minor.class}Writer writer){}
+  public void copyAsField(String name, ${minor.class}Writer writer){}
+
+  public void read(int arrayIndex, Nullable${name}Holder holder){
+    throw new ArrayIndexOutOfBoundsException();
+  }
+  </#list></#list>
+  
+  public int size(){
+    return 0;
+  }
+  
+  public boolean isSet(){
+    return false;
+  }
+  
+  public boolean next(){
+    return false;
+  }
+  
+  public RepeatedMapReader map(){
+    return this;
+  }
+  
+  public RepeatedListReader list(){
+    return this;
+  }
+  
+  public MapReader map(String name){
+    return this;
+  }
+  
+  public ListReader list(String name){
+    return this;
+  }
+  
+  public FieldReader reader(String name){
+    return this;
+  }
+  
+  public FieldReader reader(){
+    return this;
+  }
+  
+  private void fail(String name){
+    throw new IllegalArgumentException(String.format("You tried to read a %s type when you are using a ValueReader of type %s.", name, this.getClass().getSimpleName()));
+  }
+  
+  <#list ["Object", "BigDecimal", "Integer", "Long", "Boolean", 
+          "Character", "DateTime", "Period", "Double", "Float",
+          "Text", "String", "Byte", "Short", "byte[]"] as friendlyType>
+  <#assign safeType=friendlyType />
+  <#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+  
+  public ${friendlyType} read${safeType}(int arrayIndex){
+    return null;
+  }
+  
+  public ${friendlyType} read${safeType}(){
+    return null;
+  }
+  </#list>
+  
+}
+
+
+
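
Note on the template above: NullReader follows the null-object pattern. Navigation calls return the same instance, nullable reads report "not set", and friendly-type reads return null. The sketch below illustrates only that idea in plain Java. All names in it (NullReaderSketch, Reader, and the simplified NullableIntHolder) are hypothetical stand-ins, not the Arrow interfaces or the generated code.

```java
/** Hypothetical stand-ins for illustration; these are not the Arrow interfaces. */
public class NullReaderSketch {

  /** Simplified holder: isSet == 0 means "no value". */
  static final class NullableIntHolder {
    int isSet;
    int value;
  }

  /** A cut-down reader interface, assumed for this example only. */
  interface Reader {
    boolean isSet();
    Reader reader(String name);
    void read(NullableIntHolder holder);
    Integer readInteger();
  }

  /** Null object: every navigation step yields the same "nothing here" reader. */
  static final class NullReader implements Reader {
    static final NullReader INSTANCE = new NullReader();

    private NullReader() {}

    @Override public boolean isSet() { return false; }

    @Override public Reader reader(String name) { return this; }

    // A nullable holder can represent "absent", so we just clear its flag.
    @Override public void read(NullableIntHolder holder) { holder.isSet = 0; }

    @Override public Integer readInteger() { return null; }
  }

  public static void main(String[] args) {
    // Callers can walk arbitrarily deep without null checks.
    Reader r = NullReader.INSTANCE.reader("a").reader("b");
    NullableIntHolder h = new NullableIntHolder();
    h.isSet = 1;
    r.read(h);
    System.out.println(r.isSet() + " " + h.isSet + " " + r.readInteger());
  }
}
```

The design choice this illustrates is that a missing field never has to be special-cased by the caller: asking a NullReader for a child simply yields another NullReader.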

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/NullableValueVectors.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/NullableValueVectors.java b/java/vector/src/main/codegen/templates/NullableValueVectors.java
new file mode 100644
index 0000000..6893a25
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/NullableValueVectors.java
@@ -0,0 +1,630 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+
+<#assign className = "Nullable${minor.class}Vector" />
+<#assign valuesName = "${minor.class}Vector" />
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+
+<@pp.changeOutputFile name="/org/apache/arrow/vector/${className}.java" />
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/**
+ * Nullable${minor.class} implements a vector of values which could be null.  Elements in the vector
+ * are first checked against a fixed length vector of boolean values.  Then the element is retrieved
+ * from the base class (if not null).
+ *
+ * NB: this class is automatically generated from ${.template_name} and ValueVectorTypes.tdd using FreeMarker.
+ */
+@SuppressWarnings("unused")
+public final class ${className} extends BaseDataValueVector implements <#if type.major == "VarLen">VariableWidth<#else>FixedWidth</#if>Vector, NullableVector{
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${className}.class);
+
+  private final FieldReader reader = new Nullable${minor.class}ReaderImpl(Nullable${minor.class}Vector.this);
+
+  private final MaterializedField bitsField = MaterializedField.create("$bits$", new MajorType(MinorType.UINT1, DataMode.REQUIRED));
+  private final UInt1Vector bits = new UInt1Vector(bitsField, allocator);
+  private final ${valuesName} values = new ${minor.class}Vector(field, allocator);
+
+  private final Mutator mutator = new Mutator();
+  private final Accessor accessor = new Accessor();
+
+  public ${className}(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+  }
+
+  @Override
+  public FieldReader getReader(){
+    return reader;
+  }
+
+  @Override
+  public int getValueCapacity(){
+    return Math.min(bits.getValueCapacity(), values.getValueCapacity());
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final ArrowBuf[] buffers = ObjectArrays.concat(bits.getBuffers(false), values.getBuffers(false), ArrowBuf.class);
+    if (clear) {
+      for (final ArrowBuf buffer:buffers) {
+        buffer.retain(1);
+      }
+      clear();
+    }
+    return buffers;
+  }
+
+  @Override
+  public void close() {
+    bits.close();
+    values.close();
+    super.close();
+  }
+
+  @Override
+  public void clear() {
+    bits.clear();
+    values.clear();
+    super.clear();
+  }
+
+  @Override
+  public int getBufferSize(){
+    return values.getBufferSize() + bits.getBufferSize();
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    return values.getBufferSizeFor(valueCount)
+        + bits.getBufferSizeFor(valueCount);
+  }
+
+  @Override
+  public ArrowBuf getBuffer() {
+    return values.getBuffer();
+  }
+
+  @Override
+  public ${valuesName} getValuesVector() {
+    return values;
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    bits.setInitialCapacity(numRecords);
+    values.setInitialCapacity(numRecords);
+  }
+
+//  @Override
+//  public SerializedField.Builder getMetadataBuilder() {
+//    return super.getMetadataBuilder()
+//      .addChild(bits.getMetadata())
+//      .addChild(values.getMetadata());
+//  }
+
+  @Override
+  public void allocateNew() {
+    if(!allocateNewSafe()){
+      throw new OutOfMemoryException("Failure while allocating buffer.");
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    /* Boolean to keep track if all the memory allocations were successful
+     * Used in the case of composite vectors when we need to allocate multiple
+     * buffers for multiple vectors. If one of the allocations failed we need to
+     * clear all the memory that we allocated
+     */
+    boolean success = false;
+    try {
+      success = values.allocateNewSafe() && bits.allocateNewSafe();
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    bits.zeroVector();
+    mutator.reset();
+    accessor.reset();
+    return success;
+  }
+
+  <#if type.major == "VarLen">
+  @Override
+  public void allocateNew(int totalBytes, int valueCount) {
+    try {
+      values.allocateNew(totalBytes, valueCount);
+      bits.allocateNew(valueCount);
+    } catch(RuntimeException e) {
+      clear();
+      throw e;
+    }
+    bits.zeroVector();
+    mutator.reset();
+    accessor.reset();
+  }
+
+  public void reset() {
+    bits.zeroVector();
+    mutator.reset();
+    accessor.reset();
+    super.reset();
+  }
+
+  @Override
+  public int getByteCapacity(){
+    return values.getByteCapacity();
+  }
+
+  @Override
+  public int getCurrentSizeInBytes(){
+    return values.getCurrentSizeInBytes();
+  }
+
+  <#else>
+  @Override
+  public void allocateNew(int valueCount) {
+    try {
+      values.allocateNew(valueCount);
+      bits.allocateNew(valueCount+1);
+    } catch(OutOfMemoryException e) {
+      clear();
+      throw e;
+    }
+    bits.zeroVector();
+    mutator.reset();
+    accessor.reset();
+  }
+
+  @Override
+  public void reset() {
+    bits.zeroVector();
+    mutator.reset();
+    accessor.reset();
+    super.reset();
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void zeroVector() {
+    bits.zeroVector();
+    values.zeroVector();
+  }
+  </#if>
+
+
+//  @Override
+//  public void load(SerializedField metadata, ArrowBuf buffer) {
+//    clear();
+    // the bits vector is the first child (the order in which the children are added in getMetadataBuilder is significant)
+//    final SerializedField bitsField = metadata.getChild(0);
+//    bits.load(bitsField, buffer);
+//
+//    final int capacity = buffer.capacity();
+//    final int bitsLength = bitsField.getBufferLength();
+//    final SerializedField valuesField = metadata.getChild(1);
+//    values.load(valuesField, buffer.slice(bitsLength, capacity - bitsLength));
+//  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator){
+    return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator){
+    return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new TransferImpl((Nullable${minor.class}Vector) to);
+  }
+
+  public void transferTo(Nullable${minor.class}Vector target){
+    bits.transferTo(target.bits);
+    values.transferTo(target.values);
+    <#if type.major == "VarLen">
+    target.mutator.lastSet = mutator.lastSet;
+    </#if>
+    clear();
+  }
+
+  public void splitAndTransferTo(int startIndex, int length, Nullable${minor.class}Vector target) {
+    bits.splitAndTransferTo(startIndex, length, target.bits);
+    values.splitAndTransferTo(startIndex, length, target.values);
+    <#if type.major == "VarLen">
+    target.mutator.lastSet = length - 1;
+    </#if>
+  }
+
+  private class TransferImpl implements TransferPair {
+    Nullable${minor.class}Vector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator){
+      to = new Nullable${minor.class}Vector(field, allocator);
+    }
+
+    public TransferImpl(Nullable${minor.class}Vector to){
+      this.to = to;
+    }
+
+    @Override
+    public Nullable${minor.class}Vector getTo(){
+      return to;
+    }
+
+    @Override
+    public void transfer(){
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      splitAndTransferTo(startIndex, length, to);
+    }
+
+    @Override
+    public void copyValueSafe(int fromIndex, int toIndex) {
+      to.copyFromSafe(fromIndex, toIndex, Nullable${minor.class}Vector.this);
+    }
+  }
+
+  @Override
+  public Accessor getAccessor(){
+    return accessor;
+  }
+
+  @Override
+  public Mutator getMutator(){
+    return mutator;
+  }
+
+  public ${minor.class}Vector convertToRequiredVector(){
+    ${minor.class}Vector v = new ${minor.class}Vector(getField().getOtherNullableVersion(), allocator);
+    if (v.data != null) {
+      v.data.release(1);
+    }
+    v.data = values.data;
+    v.data.retain(1);
+    clear();
+    return v;
+  }
+
+  public void copyFrom(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){
+    final Accessor fromAccessor = from.getAccessor();
+    if (!fromAccessor.isNull(fromIndex)) {
+      mutator.set(thisIndex, fromAccessor.get(fromIndex));
+    }
+  }
+
+  public void copyFromSafe(int fromIndex, int thisIndex, ${minor.class}Vector from){
+    <#if type.major == "VarLen">
+    mutator.fillEmpties(thisIndex);
+    </#if>
+    values.copyFromSafe(fromIndex, thisIndex, from);
+    bits.getMutator().setSafe(thisIndex, 1);
+  }
+
+  public void copyFromSafe(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){
+    <#if type.major == "VarLen">
+    mutator.fillEmpties(thisIndex);
+    </#if>
+    bits.copyFromSafe(fromIndex, thisIndex, from.bits);
+    values.copyFromSafe(fromIndex, thisIndex, from.values);
+  }
+
+  public final class Accessor extends BaseDataValueVector.BaseAccessor <#if type.major == "VarLen">implements VariableWidthVector.VariableWidthAccessor</#if> {
+    final UInt1Vector.Accessor bAccessor = bits.getAccessor();
+    final ${valuesName}.Accessor vAccessor = values.getAccessor();
+
+    /**
+     * Get the element at the specified position.
+     *
+     * @param   index   position of the value
+     * @return  value of the element, if not null
+     * @throws  IllegalStateException if the value is null
+     */
+    public <#if type.major == "VarLen">byte[]<#else>${minor.javaType!type.javaType}</#if> get(int index) {
+      if (isNull(index)) {
+          throw new IllegalStateException("Can't get a null value");
+      }
+      return vAccessor.get(index);
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return isSet(index) == 0;
+    }
+
+    public int isSet(int index){
+      return bAccessor.get(index);
+    }
+
+    <#if type.major == "VarLen">
+    public long getStartEnd(int index){
+      return vAccessor.getStartEnd(index);
+    }
+
+    @Override
+    public int getValueLength(int index) {
+      return values.getAccessor().getValueLength(index);
+    }
+    </#if>
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      vAccessor.get(index, holder);
+      holder.isSet = bAccessor.get(index);
+
+      <#if minor.class.startsWith("Decimal")>
+      holder.scale = getField().getScale();
+      holder.precision = getField().getPrecision();
+      </#if>
+    }
+
+    @Override
+    public ${friendlyType} getObject(int index) {
+      if (isNull(index)) {
+          return null;
+      }else{
+        return vAccessor.getObject(index);
+      }
+    }
+
+    <#if minor.class == "Interval" || minor.class == "IntervalDay" || minor.class == "IntervalYear">
+    public StringBuilder getAsStringBuilder(int index) {
+      if (isNull(index)) {
+          return null;
+      }else{
+        return vAccessor.getAsStringBuilder(index);
+      }
+    }
+    </#if>
+
+    @Override
+    public int getValueCount(){
+      return bits.getAccessor().getValueCount();
+    }
+
+    public void reset(){}
+  }
+
+  public final class Mutator extends BaseDataValueVector.BaseMutator implements NullableVectorDefinitionSetter<#if type.major == "VarLen">, VariableWidthVector.VariableWidthMutator</#if> {
+    private int setCount;
+    <#if type.major == "VarLen"> private int lastSet = -1;</#if>
+
+    private Mutator(){
+    }
+
+    public ${valuesName} getVectorWithValues(){
+      return values;
+    }
+
+    @Override
+    public void setIndexDefined(int index){
+      bits.getMutator().set(index, 1);
+    }
+
+    /**
+     * Set the element at the specified index to the supplied value and mark it as non-null.
+     *
+     * @param index   position of the element to set
+     * @param value   value to write (a byte array for variable-length types)
+     */
+    public void set(int index, <#if type.major == "VarLen">byte[]<#elseif (type.width < 4)>int<#else>${minor.javaType!type.javaType}</#if> value) {
+      setCount++;
+      final ${valuesName}.Mutator valuesMutator = values.getMutator();
+      final UInt1Vector.Mutator bitsMutator = bits.getMutator();
+      <#if type.major == "VarLen">
+      for (int i = lastSet + 1; i < index; i++) {
+        valuesMutator.set(i, emptyByteArray);
+      }
+      </#if>
+      bitsMutator.set(index, 1);
+      valuesMutator.set(index, value);
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    <#if type.major == "VarLen">
+
+    private void fillEmpties(int index){
+      final ${valuesName}.Mutator valuesMutator = values.getMutator();
+      for (int i = lastSet; i < index; i++) {
+        valuesMutator.setSafe(i + 1, emptyByteArray);
+      }
+      while(index > bits.getValueCapacity()) {
+        bits.reAlloc();
+      }
+      lastSet = index;
+    }
+
+    @Override
+    public void setValueLengthSafe(int index, int length) {
+      values.getMutator().setValueLengthSafe(index, length);
+      lastSet = index;
+    }
+    </#if>
+
+    public void setSafe(int index, byte[] value, int start, int length) {
+      <#if type.major != "VarLen">
+      throw new UnsupportedOperationException();
+      <#else>
+      fillEmpties(index);
+
+      bits.getMutator().setSafe(index, 1);
+      values.getMutator().setSafe(index, value, start, length);
+      setCount++;
+      <#if type.major == "VarLen">lastSet = index;</#if>
+      </#if>
+    }
+
+    public void setSafe(int index, ByteBuffer value, int start, int length) {
+      <#if type.major != "VarLen">
+      throw new UnsupportedOperationException();
+      <#else>
+      fillEmpties(index);
+
+      bits.getMutator().setSafe(index, 1);
+      values.getMutator().setSafe(index, value, start, length);
+      setCount++;
+      <#if type.major == "VarLen">lastSet = index;</#if>
+      </#if>
+    }
+
+    public void setNull(int index){
+      bits.getMutator().setSafe(index, 0);
+    }
+
+    public void setSkipNull(int index, ${minor.class}Holder holder){
+      values.getMutator().set(index, holder);
+    }
+
+    public void setSkipNull(int index, Nullable${minor.class}Holder holder){
+      values.getMutator().set(index, holder);
+    }
+
+
+    public void set(int index, Nullable${minor.class}Holder holder){
+      final ${valuesName}.Mutator valuesMutator = values.getMutator();
+      <#if type.major == "VarLen">
+      for (int i = lastSet + 1; i < index; i++) {
+        valuesMutator.set(i, emptyByteArray);
+      }
+      </#if>
+      bits.getMutator().set(index, holder.isSet);
+      valuesMutator.set(index, holder);
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    public void set(int index, ${minor.class}Holder holder){
+      final ${valuesName}.Mutator valuesMutator = values.getMutator();
+      <#if type.major == "VarLen">
+      for (int i = lastSet + 1; i < index; i++) {
+        valuesMutator.set(i, emptyByteArray);
+      }
+      </#if>
+      bits.getMutator().set(index, 1);
+      valuesMutator.set(index, holder);
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    public boolean isSafe(int outIndex) {
+      return outIndex < Nullable${minor.class}Vector.this.getValueCapacity();
+    }
+
+    <#assign fields = minor.fields!type.fields />
+    public void set(int index, int isSet<#list fields as field><#if field.include!true >, ${field.type} ${field.name}Field</#if></#list> ){
+      final ${valuesName}.Mutator valuesMutator = values.getMutator();
+      <#if type.major == "VarLen">
+      for (int i = lastSet + 1; i < index; i++) {
+        valuesMutator.set(i, emptyByteArray);
+      }
+      </#if>
+      bits.getMutator().set(index, isSet);
+      valuesMutator.set(index<#list fields as field><#if field.include!true >, ${field.name}Field</#if></#list>);
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    public void setSafe(int index, int isSet<#list fields as field><#if field.include!true >, ${field.type} ${field.name}Field</#if></#list> ) {
+      <#if type.major == "VarLen">
+      fillEmpties(index);
+      </#if>
+
+      bits.getMutator().setSafe(index, isSet);
+      values.getMutator().setSafe(index<#list fields as field><#if field.include!true >, ${field.name}Field</#if></#list>);
+      setCount++;
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+
+    public void setSafe(int index, Nullable${minor.class}Holder value) {
+
+      <#if type.major == "VarLen">
+      fillEmpties(index);
+      </#if>
+      bits.getMutator().setSafe(index, value.isSet);
+      values.getMutator().setSafe(index, value);
+      setCount++;
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    public void setSafe(int index, ${minor.class}Holder value) {
+
+      <#if type.major == "VarLen">
+      fillEmpties(index);
+      </#if>
+      bits.getMutator().setSafe(index, 1);
+      values.getMutator().setSafe(index, value);
+      setCount++;
+      <#if type.major == "VarLen">lastSet = index;</#if>
+    }
+
+    <#if !(type.major == "VarLen" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense" || minor.class == "Interval" || minor.class == "IntervalDay")>
+      public void setSafe(int index, ${minor.javaType!type.javaType} value) {
+        <#if type.major == "VarLen">
+        fillEmpties(index);
+        </#if>
+        bits.getMutator().setSafe(index, 1);
+        values.getMutator().setSafe(index, value);
+        setCount++;
+      }
+
+    </#if>
+
+    @Override
+    public void setValueCount(int valueCount) {
+      assert valueCount >= 0;
+      <#if type.major == "VarLen">
+      fillEmpties(valueCount);
+      </#if>
+      values.getMutator().setValueCount(valueCount);
+      bits.getMutator().setValueCount(valueCount);
+    }
+
+    @Override
+    public void generateTestData(int valueCount){
+      bits.getMutator().generateTestDataAlt(valueCount);
+      values.getMutator().generateTestData(valueCount);
+      <#if type.major == "VarLen">lastSet = valueCount;</#if>
+      setValueCount(valueCount);
+    }
+
+    @Override
+    public void reset(){
+      setCount = 0;
+      <#if type.major == "VarLen">lastSet = -1;</#if>
+    }
+  }
+}
+</#list>
+</#list>
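
Note on the template above: each generated Nullable*Vector pairs a per-slot "is set" vector (bits) with a values vector, and every accessor consults bits before touching values. The sketch below shows that layout in plain Java. It is a simplified illustration under stated assumptions, not the generated code: the class name NullableIntSketch and its array-backed storage are hypothetical, whereas the real classes use UInt1Vector and ArrowBuf.

```java
import java.util.Arrays;

/** Hypothetical, array-backed illustration of the bits + values layout. */
public class NullableIntSketch {
  private byte[] bits;   // 1 = value is set, 0 = null (plays the role of the "$bits$" vector)
  private int[] values;  // slot contents; meaningful only where bits[i] == 1

  public NullableIntSketch(int capacity) {
    bits = new byte[capacity];
    values = new int[capacity];
  }

  /** Grows both arrays if needed, then writes the value and marks the slot as set. */
  public void setSafe(int index, int value) {
    ensureCapacity(index + 1);
    bits[index] = 1;
    values[index] = value;
  }

  /** Marks the slot as null; the stale value is left in place and ignored. */
  public void setNull(int index) {
    ensureCapacity(index + 1);
    bits[index] = 0;
  }

  public boolean isNull(int index) {
    return bits[index] == 0;
  }

  /** Primitive read: refuses to return a value for a null slot. */
  public int get(int index) {
    if (isNull(index)) {
      throw new IllegalStateException("Can't get a null value");
    }
    return values[index];
  }

  /** Boxed read: maps a null slot to a Java null. */
  public Integer getObject(int index) {
    return isNull(index) ? null : Integer.valueOf(values[index]);
  }

  private void ensureCapacity(int needed) {
    if (needed > bits.length) {
      int newCapacity = Math.max(needed, Math.max(1, bits.length * 2));
      // New slots are zero-filled, so they start out null.
      bits = Arrays.copyOf(bits, newCapacity);
      values = Arrays.copyOf(values, newCapacity);
    }
  }

  public static void main(String[] args) {
    NullableIntSketch v = new NullableIntSketch(2);
    v.setSafe(0, 42);
    v.setSafe(3, 7);  // forces a reallocation; slots 1 and 2 stay null
    System.out.println(v.getObject(0));
    System.out.println(v.getObject(1));
    System.out.println(v.getObject(3));
  }
}
```

As in the template, capacity is bounded by both underlying stores, and zeroing the bits is what makes a freshly allocated slot read as null.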

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/RepeatedValueVectors.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/RepeatedValueVectors.java b/java/vector/src/main/codegen/templates/RepeatedValueVectors.java
new file mode 100644
index 0000000..5ac80f5
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/RepeatedValueVectors.java
@@ -0,0 +1,421 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+<#assign fields = minor.fields!type.fields />
+
+<@pp.changeOutputFile name="/org/apache/arrow/vector/Repeated${minor.class}Vector.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/**
+ * Repeated${minor.class} implements a vector with multiple values per row (e.g. JSON array or
+ * repeated protobuf field).  The implementation uses two additional value vectors; one to convert
+ * the index offset to the underlying element offset, and another to store the number of values
+ * in the vector.
+ *
+ * NB: this class is automatically generated from ${.template_name} and ValueVectorTypes.tdd using FreeMarker.
+ */
+
+public final class Repeated${minor.class}Vector extends BaseRepeatedValueVector implements Repeated<#if type.major == "VarLen">VariableWidth<#else>FixedWidth</#if>VectorLike {
+  //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(Repeated${minor.class}Vector.class);
+
+  // we maintain local reference to concrete vector type for performance reasons.
+  private ${minor.class}Vector values;
+  private final FieldReader reader = new Repeated${minor.class}ReaderImpl(Repeated${minor.class}Vector.this);
+  private final Mutator mutator = new Mutator();
+  private final Accessor accessor = new Accessor();
+
+  public Repeated${minor.class}Vector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+    addOrGetVector(VectorDescriptor.create(new MajorType(field.getType().getMinorType(), DataMode.REQUIRED)));
+  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  @Override
+  public Accessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public FieldReader getReader() {
+    return reader;
+  }
+
+  @Override
+  public ${minor.class}Vector getDataVector() {
+    return values;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator){
+    return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new TransferImpl((Repeated${minor.class}Vector) to);
+  }
+
+  @Override
+  public AddOrGetResult<${minor.class}Vector> addOrGetVector(VectorDescriptor descriptor) {
+    final AddOrGetResult<${minor.class}Vector> result = super.addOrGetVector(descriptor);
+    if (result.isCreated()) {
+      values = result.getVector();
+    }
+    return result;
+  }
+
+  public void transferTo(Repeated${minor.class}Vector target) {
+    target.clear();
+    offsets.transferTo(target.offsets);
+    values.transferTo(target.values);
+    clear();
+  }
+
+  public void splitAndTransferTo(final int startIndex, final int groups, Repeated${minor.class}Vector to) {
+    final UInt4Vector.Accessor a = offsets.getAccessor();
+    final UInt4Vector.Mutator m = to.offsets.getMutator();
+
+    final int startPos = a.get(startIndex);
+    final int endPos = a.get(startIndex + groups);
+    final int valuesToCopy = endPos - startPos;
+
+    values.splitAndTransferTo(startPos, valuesToCopy, to.values);
+    to.offsets.clear();
+    to.offsets.allocateNew(groups + 1);
+    int normalizedPos = 0;
+    for (int i=0; i < groups + 1;i++ ) {
+      normalizedPos = a.get(startIndex+i) - startPos;
+      m.set(i, normalizedPos);
+    }
+    m.setValueCount(groups == 0 ? 0 : groups + 1);
+  }
+
+  private class TransferImpl implements TransferPair {
+    final Repeated${minor.class}Vector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator) {
+      this.to = new Repeated${minor.class}Vector(field, allocator);
+    }
+
+    public TransferImpl(Repeated${minor.class}Vector to) {
+      this.to = to;
+    }
+
+    @Override
+    public Repeated${minor.class}Vector getTo() {
+      return to;
+    }
+
+    @Override
+    public void transfer() {
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      splitAndTransferTo(startIndex, length, to);
+    }
+
+    @Override
+    public void copyValueSafe(int fromIndex, int toIndex) {
+      to.copyFromSafe(fromIndex, toIndex, Repeated${minor.class}Vector.this);
+    }
+  }
+
+  public void copyFrom(int inIndex, int outIndex, Repeated${minor.class}Vector v) {
+    final Accessor vAccessor = v.getAccessor();
+    final int count = vAccessor.getInnerValueCountAt(inIndex);
+    mutator.startNewValue(outIndex);
+    for (int i = 0; i < count; i++) {
+      mutator.add(outIndex, vAccessor.get(inIndex, i));
+    }
+  }
+
+  public void copyFromSafe(int inIndex, int outIndex, Repeated${minor.class}Vector v) {
+    final Accessor vAccessor = v.getAccessor();
+    final int count = vAccessor.getInnerValueCountAt(inIndex);
+    mutator.startNewValue(outIndex);
+    for (int i = 0; i < count; i++) {
+      mutator.addSafe(outIndex, vAccessor.get(inIndex, i));
+    }
+  }
+
+  public boolean allocateNewSafe() {
+    /* Boolean to keep track if all the memory allocations were successful
+     * Used in the case of composite vectors when we need to allocate multiple
+     * buffers for multiple vectors. If one of the allocations failed we need to
+     * clear all the memory that we allocated
+     */
+    boolean success = false;
+    try {
+      if(!offsets.allocateNewSafe()) return false;
+      if(!values.allocateNewSafe()) return false;
+      success = true;
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    offsets.zeroVector();
+    mutator.reset();
+    return true;
+  }
+
+  @Override
+  public void allocateNew() {
+    try {
+      offsets.allocateNew();
+      values.allocateNew();
+    } catch (OutOfMemoryException e) {
+      clear();
+      throw e;
+    }
+    offsets.zeroVector();
+    mutator.reset();
+  }
+
+  <#if type.major == "VarLen">
+//  @Override
+//  protected SerializedField.Builder getMetadataBuilder() {
+//    return super.getMetadataBuilder()
+//            .setVarByteLength(values.getVarByteLength());
+//  }
+
+  public void allocateNew(int totalBytes, int valueCount, int innerValueCount) {
+    try {
+      offsets.allocateNew(valueCount + 1);
+      values.allocateNew(totalBytes, innerValueCount);
+    } catch (OutOfMemoryException e) {
+      clear();
+      throw e;
+    }
+    offsets.zeroVector();
+    mutator.reset();
+  }
+
+  public int getByteCapacity(){
+    return values.getByteCapacity();
+  }
+
+  <#else>
+
+  @Override
+  public void allocateNew(int valueCount, int innerValueCount) {
+    clear();
+    /* Tracks whether all of the memory allocations were successful.
+     * Used in the case of composite vectors when we need to allocate multiple
+     * buffers for multiple vectors. If one of the allocations fails, we need to
+     * clear all the memory that we allocated.
+     */
+    boolean success = false;
+    try {
+      offsets.allocateNew(valueCount + 1);
+      values.allocateNew(innerValueCount);
+    } catch(OutOfMemoryException e){
+      clear();
+      throw e;
+    }
+    offsets.zeroVector();
+    mutator.reset();
+  }
+
+  </#if>
+
+  // This is declared as a subclass of the accessor declared inside of FixedWidthVector. It is also used for
+  // variable-length vectors, as they should have a consistent interface as much as possible. If they need to diverge
+  // in the future, the interface should be declared in the respective value vector superclasses for fixed and variable
+  // width, and we should refer to each in the generation template.
+  public final class Accessor extends BaseRepeatedValueVector.BaseRepeatedAccessor {
+    @Override
+    public List<${friendlyType}> getObject(int index) {
+      final List<${friendlyType}> vals = new JsonStringArrayList<>();
+      final UInt4Vector.Accessor offsetsAccessor = offsets.getAccessor();
+      final int start = offsetsAccessor.get(index);
+      final int end = offsetsAccessor.get(index + 1);
+      final ${minor.class}Vector.Accessor valuesAccessor = values.getAccessor();
+      for(int i = start; i < end; i++) {
+        vals.add(valuesAccessor.getObject(i));
+      }
+      return vals;
+    }
+
+    public ${friendlyType} getSingleObject(int index, int arrayIndex) {
+      final int start = offsets.getAccessor().get(index);
+      return values.getAccessor().getObject(start + arrayIndex);
+    }
+
+    /**
+     * Get a value for the given record.  Each element in the repeated field is accessed by
+     * the positionIndex param.
+     *
+     * @param  index           record containing the repeated field
+     * @param  positionIndex   position within the repeated field
+     * @return element at the given position in the given record
+     */
+    public <#if type.major == "VarLen">byte[]
+           <#else>${minor.javaType!type.javaType}
+           </#if> get(int index, int positionIndex) {
+      return values.getAccessor().get(offsets.getAccessor().get(index) + positionIndex);
+    }
+
+    public void get(int index, Repeated${minor.class}Holder holder) {
+      holder.start = offsets.getAccessor().get(index);
+      holder.end =  offsets.getAccessor().get(index+1);
+      holder.vector = values;
+    }
+
+    public void get(int index, int positionIndex, ${minor.class}Holder holder) {
+      final int offset = offsets.getAccessor().get(index);
+      assert offset >= 0;
+      assert positionIndex < getInnerValueCountAt(index);
+      values.getAccessor().get(offset + positionIndex, holder);
+    }
+
+    public void get(int index, int positionIndex, Nullable${minor.class}Holder holder) {
+      final int offset = offsets.getAccessor().get(index);
+      assert offset >= 0;
+      if (positionIndex >= getInnerValueCountAt(index)) {
+        holder.isSet = 0;
+        return;
+      }
+      values.getAccessor().get(offset + positionIndex, holder);
+    }
+  }
+
+  public final class Mutator extends BaseRepeatedValueVector.BaseRepeatedMutator implements RepeatedMutator {
+    private Mutator() {}
+
+    /**
+     * Add an element to the given record index.  This is similar to the set() method in other
+     * value vectors, except that it permits setting multiple values for a single record.
+     *
+     * @param index   record of the element to add
+     * @param value   value to add to the given row
+     */
+    public void add(int index, <#if type.major == "VarLen">byte[]<#elseif (type.width < 4)>int<#else>${minor.javaType!type.javaType}</#if> value) {
+      int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().set(nextOffset, value);
+      offsets.getMutator().set(index+1, nextOffset+1);
+    }
+
+    <#if type.major == "VarLen">
+    public void addSafe(int index, byte[] bytes) {
+      addSafe(index, bytes, 0, bytes.length);
+    }
+
+    public void addSafe(int index, byte[] bytes, int start, int length) {
+      final int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().setSafe(nextOffset, bytes, start, length);
+      offsets.getMutator().setSafe(index+1, nextOffset+1);
+    }
+
+    <#else>
+
+    public void addSafe(int index, ${minor.javaType!type.javaType} srcValue) {
+      final int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().setSafe(nextOffset, srcValue);
+      offsets.getMutator().setSafe(index+1, nextOffset+1);
+    }
+
+    </#if>
+
+    public void setSafe(int index, Repeated${minor.class}Holder h) {
+      final ${minor.class}Holder ih = new ${minor.class}Holder();
+      final ${minor.class}Vector.Accessor hVectorAccessor = h.vector.getAccessor();
+      mutator.startNewValue(index);
+      for(int i = h.start; i < h.end; i++){
+        hVectorAccessor.get(i, ih);
+        mutator.addSafe(index, ih);
+      }
+    }
+
+    public void addSafe(int index, ${minor.class}Holder holder) {
+      int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().setSafe(nextOffset, holder);
+      offsets.getMutator().setSafe(index+1, nextOffset+1);
+    }
+
+    public void addSafe(int index, Nullable${minor.class}Holder holder) {
+      final int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().setSafe(nextOffset, holder);
+      offsets.getMutator().setSafe(index+1, nextOffset+1);
+    }
+
+    <#if (fields?size > 1) && !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
+    public void addSafe(int arrayIndex, <#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+      int nextOffset = offsets.getAccessor().get(arrayIndex+1);
+      values.getMutator().setSafe(nextOffset, <#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+      offsets.getMutator().setSafe(arrayIndex+1, nextOffset+1);
+    }
+    </#if>
+
+    protected void add(int index, ${minor.class}Holder holder) {
+      int nextOffset = offsets.getAccessor().get(index+1);
+      values.getMutator().set(nextOffset, holder);
+      offsets.getMutator().set(index+1, nextOffset+1);
+    }
+
+    public void add(int index, Repeated${minor.class}Holder holder) {
+
+      ${minor.class}Vector.Accessor accessor = holder.vector.getAccessor();
+      ${minor.class}Holder innerHolder = new ${minor.class}Holder();
+
+      for(int i = holder.start; i < holder.end; i++) {
+        accessor.get(i, innerHolder);
+        add(index, innerHolder);
+      }
+    }
+
+    @Override
+    public void generateTestData(final int valCount) {
+      final int[] sizes = {1, 2, 0, 6};
+      int size = 0;
+      int runningOffset = 0;
+      final UInt4Vector.Mutator offsetsMutator = offsets.getMutator();
+      for(int i = 1; i < valCount + 1; i++, size++) {
+        runningOffset += sizes[size % sizes.length];
+        offsetsMutator.set(i, runningOffset);
+      }
+      values.getMutator().generateTestData(valCount * 9);
+      setValueCount(size);
+    }
+
+    @Override
+    public void reset() {
+    }
+  }
+}
+</#list>
+</#list>

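Editor's note on the repeated-vector template above: the accessor and mutator both rely on an offsets vector in which the values of record i occupy positions offsets[i] up to (but not including) offsets[i + 1] of a flat values vector. The following is a minimal sketch of that bookkeeping using plain int arrays. The class RepeatedIntSketch and its fields are hypothetical and are not part of the Arrow API, and the body of startNewValue is an assumption, since BaseRepeatedValueVector is not shown in this part of the diff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for a repeated int vector; not the Arrow API. */
final class RepeatedIntSketch {
  // offsets[i] .. offsets[i + 1] bound the values belonging to record i.
  private final int[] offsets;
  // Flat storage holding the inner values of every record back to back.
  private final int[] values;

  RepeatedIntSketch(int recordCapacity, int valueCapacity) {
    this.offsets = new int[recordCapacity + 1]; // zero-filled, like offsets.zeroVector()
    this.values = new int[valueCapacity];
  }

  /** Assumed behaviour: a new record starts where the previous record ended. */
  void startNewValue(int index) {
    offsets[index + 1] = offsets[index];
  }

  /** Mirrors Mutator.add: append one value to the record at index. */
  void add(int index, int value) {
    int nextOffset = offsets[index + 1];
    values[nextOffset] = value;
    offsets[index + 1] = nextOffset + 1;
  }

  /** Number of inner values stored for the record at index. */
  int getInnerValueCountAt(int index) {
    return offsets[index + 1] - offsets[index];
  }

  /** Mirrors Accessor.get(index, positionIndex). */
  int get(int index, int positionIndex) {
    return values[offsets[index] + positionIndex];
  }

  /** Mirrors Accessor.getObject: materialize one record as a list. */
  List<Integer> getObject(int index) {
    List<Integer> vals = new ArrayList<>();
    for (int i = offsets[index]; i < offsets[index + 1]; i++) {
      vals.add(values[i]);
    }
    return vals;
  }
}
```

Because add only ever advances offsets[index + 1], records have to be written in increasing index order, which appears to be why copyFrom and setSafe in the template call startNewValue before adding values.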
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/UnionListWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/UnionListWriter.java b/java/vector/src/main/codegen/templates/UnionListWriter.java
new file mode 100644
index 0000000..9a6b08f
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/UnionListWriter.java
@@ -0,0 +1,185 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.lang.UnsupportedOperationException;
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/UnionListWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using freemarker and the ${.template_name} template.
+ */
+
+@SuppressWarnings("unused")
+public class UnionListWriter extends AbstractFieldWriter {
+
+  private ListVector vector;
+  private UInt4Vector offsets;
+  private PromotableWriter writer;
+  private boolean inMap = false;
+  private String mapName;
+  private int lastIndex = 0;
+
+  public UnionListWriter(ListVector vector) {
+    super(null);
+    this.vector = vector;
+    this.writer = new PromotableWriter(vector.getDataVector(), vector);
+    this.offsets = vector.getOffsetVector();
+  }
+
+  public UnionListWriter(ListVector vector, AbstractFieldWriter parent) {
+    this(vector);
+  }
+
+  @Override
+  public void allocate() {
+    vector.allocateNew();
+  }
+
+  @Override
+  public void clear() {
+    vector.clear();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return null;
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return vector.getValueCapacity();
+  }
+
+  @Override
+  public void close() throws Exception {
+
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+
+  <#if !minor.class?starts_with("Decimal")>
+
+  @Override
+  public ${name}Writer <#if uncappedName == "int">integer<#else>${uncappedName}</#if>() {
+    return this;
+  }
+
+  @Override
+  public ${name}Writer <#if uncappedName == "int">integer<#else>${uncappedName}</#if>(String name) {
+    assert inMap;
+    mapName = name;
+    final int nextOffset = offsets.getAccessor().get(idx() + 1);
+    vector.getMutator().setNotNull(idx());
+    writer.setPosition(nextOffset);
+    ${name}Writer ${uncappedName}Writer = writer.<#if uncappedName == "int">integer<#else>${uncappedName}</#if>(name);
+    return ${uncappedName}Writer;
+  }
+
+  </#if>
+
+  </#list></#list>
+
+  @Override
+  public MapWriter map() {
+    inMap = true;
+    return this;
+  }
+
+  @Override
+  public ListWriter list() {
+    final int nextOffset = offsets.getAccessor().get(idx() + 1);
+    vector.getMutator().setNotNull(idx());
+    offsets.getMutator().setSafe(idx() + 1, nextOffset + 1);
+    writer.setPosition(nextOffset);
+    return writer;
+  }
+
+  @Override
+  public ListWriter list(String name) {
+    final int nextOffset = offsets.getAccessor().get(idx() + 1);
+    vector.getMutator().setNotNull(idx());
+    writer.setPosition(nextOffset);
+    ListWriter listWriter = writer.list(name);
+    return listWriter;
+  }
+
+  @Override
+  public MapWriter map(String name) {
+    MapWriter mapWriter = writer.map(name);
+    return mapWriter;
+  }
+
+  @Override
+  public void startList() {
+    vector.getMutator().startNewValue(idx());
+  }
+
+  @Override
+  public void endList() {
+
+  }
+
+  @Override
+  public void start() {
+    assert inMap;
+    final int nextOffset = offsets.getAccessor().get(idx() + 1);
+    vector.getMutator().setNotNull(idx());
+    offsets.getMutator().setSafe(idx() + 1, nextOffset);
+    writer.setPosition(nextOffset);
+  }
+
+  @Override
+  public void end() {
+    if (inMap) {
+      inMap = false;
+      final int nextOffset = offsets.getAccessor().get(idx() + 1);
+      offsets.getMutator().setSafe(idx() + 1, nextOffset + 1);
+    }
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+
+  <#if !minor.class?starts_with("Decimal")>
+
+  @Override
+  public void write${name}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    assert !inMap;
+    final int nextOffset = offsets.getAccessor().get(idx() + 1);
+    vector.getMutator().setNotNull(idx());
+    writer.setPosition(nextOffset);
+    writer.write${name}(<#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+    offsets.getMutator().setSafe(idx() + 1, nextOffset + 1);
+  }
+
+  </#if>
+
+  </#list></#list>
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/UnionReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/UnionReader.java b/java/vector/src/main/codegen/templates/UnionReader.java
new file mode 100644
index 0000000..44c3e55
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/UnionReader.java
@@ -0,0 +1,194 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/UnionReader.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+@SuppressWarnings("unused")
+public class UnionReader extends AbstractFieldReader {
+
+  private BaseReader[] readers = new BaseReader[43];
+  public UnionVector data;
+  
+  public UnionReader(UnionVector data) {
+    this.data = data;
+  }
+
+  private static MajorType[] TYPES = new MajorType[43];
+
+  static {
+    for (MinorType minorType : MinorType.values()) {
+      TYPES[minorType.ordinal()] = new MajorType(minorType, DataMode.OPTIONAL);
+    }
+  }
+
+  public MajorType getType() {
+    return TYPES[data.getTypeValue(idx())];
+  }
+
+  public boolean isSet(){
+    return !data.getAccessor().isNull(idx());
+  }
+
+  public void read(UnionHolder holder) {
+    holder.reader = this;
+    holder.isSet = this.isSet() ? 1 : 0;
+  }
+
+  public void read(int index, UnionHolder holder) {
+    getList().read(index, holder);
+  }
+
+  private FieldReader getReaderForIndex(int index) {
+    int typeValue = data.getTypeValue(index);
+    FieldReader reader = (FieldReader) readers[typeValue];
+    if (reader != null) {
+      return reader;
+    }
+    switch (MinorType.values()[typeValue]) {
+    case LATE:
+      return NullReader.INSTANCE;
+    case MAP:
+      return (FieldReader) getMap();
+    case LIST:
+      return (FieldReader) getList();
+    <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+    <#assign uncappedName = name?uncap_first/>
+    <#if !minor.class?starts_with("Decimal")>
+    case ${name?upper_case}:
+      return (FieldReader) get${name}();
+    </#if>
+    </#list></#list>
+    default:
+      throw new UnsupportedOperationException("Unsupported type: " + MinorType.values()[typeValue]);
+    }
+  }
+
+  private SingleMapReaderImpl mapReader;
+
+  private MapReader getMap() {
+    if (mapReader == null) {
+      mapReader = (SingleMapReaderImpl) data.getMap().getReader();
+      mapReader.setPosition(idx());
+      readers[MinorType.MAP.ordinal()] = mapReader;
+    }
+    return mapReader;
+  }
+
+  private UnionListReader listReader;
+
+  private FieldReader getList() {
+    if (listReader == null) {
+      listReader = new UnionListReader(data.getList());
+      listReader.setPosition(idx());
+      readers[MinorType.LIST.ordinal()] = listReader;
+    }
+    return listReader;
+  }
+
+  @Override
+  public java.util.Iterator<String> iterator() {
+    return getMap().iterator();
+  }
+
+  @Override
+  public void copyAsValue(UnionWriter writer) {
+    writer.data.copyFrom(idx(), writer.idx(), data);
+  }
+
+  <#list ["Object", "BigDecimal", "Integer", "Long", "Boolean",
+          "Character", "DateTime", "Period", "Double", "Float",
+          "Text", "String", "Byte", "Short", "byte[]"] as friendlyType>
+  <#assign safeType=friendlyType />
+  <#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+
+  @Override
+  public ${friendlyType} read${safeType}() {
+    return getReaderForIndex(idx()).read${safeType}();
+  }
+
+  </#list>
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+          <#assign uncappedName = name?uncap_first/>
+  <#assign boxedType = (minor.boxedType!type.boxedType) />
+  <#assign javaType = (minor.javaType!type.javaType) />
+  <#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+  <#assign safeType=friendlyType />
+  <#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+  <#if !minor.class?starts_with("Decimal")>
+
+  private Nullable${name}ReaderImpl ${uncappedName}Reader;
+
+  private Nullable${name}ReaderImpl get${name}() {
+    if (${uncappedName}Reader == null) {
+      ${uncappedName}Reader = new Nullable${name}ReaderImpl(data.get${name}Vector());
+      ${uncappedName}Reader.setPosition(idx());
+      readers[MinorType.${name?upper_case}.ordinal()] = ${uncappedName}Reader;
+    }
+    return ${uncappedName}Reader;
+  }
+
+  public void read(Nullable${name}Holder holder){
+    getReaderForIndex(idx()).read(holder);
+  }
+
+  public void copyAsValue(${name}Writer writer){
+    getReaderForIndex(idx()).copyAsValue(writer);
+  }
+  </#if>
+  </#list></#list>
+
+  @Override
+  public void copyAsValue(ListWriter writer) {
+    ComplexCopier.copy(this, (FieldWriter) writer);
+  }
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    for (BaseReader reader : readers) {
+      if (reader != null) {
+        reader.setPosition(index);
+      }
+    }
+  }
+  
+  public FieldReader reader(String name){
+    return getMap().reader(name);
+  }
+
+  public FieldReader reader() {
+    return getList().reader();
+  }
+
+  public boolean next() {
+    return getReaderForIndex(idx()).next();
+  }
+}
+
+
+

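Editor's note on the UnionReader template above: getReaderForIndex looks up a per-type reader in an array indexed by the type ordinal, creates it lazily on first use, and setPosition forwards the new position to every reader created so far. The sketch below illustrates that pattern in isolation. All names in it (LazyReaderDispatchSketch, Kind, ColumnReader, readersCreated) are hypothetical, and it models each typed child vector as a plain Object array rather than using any Arrow class.

```java
/** Hypothetical illustration of lazy, ordinal-indexed reader dispatch; not the Arrow API. */
final class LazyReaderDispatchSketch {
  enum Kind { INT, TEXT }

  interface Reader {
    void setPosition(int index);
    Object read();
  }

  /** Reads one slot of a single typed column. */
  static final class ColumnReader implements Reader {
    private final Object[] column;
    private int position;

    ColumnReader(Object[] column) {
      this.column = column;
    }

    @Override
    public void setPosition(int index) {
      position = index;
    }

    @Override
    public Object read() {
      return column[position];
    }
  }

  private final Kind[] types;   // per-row type tag, like data.getTypeValue(index)
  private final Object[] ints;  // stand-in for the INT child vector
  private final Object[] texts; // stand-in for the TEXT child vector
  // One cache slot per type, sized from the enum rather than a fixed constant.
  private final Reader[] readers = new Reader[Kind.values().length];
  private int position;
  int readersCreated; // exposed only so the laziness can be observed

  LazyReaderDispatchSketch(Kind[] types, Object[] ints, Object[] texts) {
    this.types = types;
    this.ints = ints;
    this.texts = texts;
  }

  /** Moves this reader and every already-created child reader to index. */
  void setPosition(int index) {
    position = index;
    for (Reader reader : readers) {
      if (reader != null) {
        reader.setPosition(index);
      }
    }
  }

  /** Returns the cached reader for the current row's type, creating it on first use. */
  private Reader readerForCurrentRow() {
    Kind kind = types[position];
    Reader reader = readers[kind.ordinal()];
    if (reader == null) {
      reader = new ColumnReader(kind == Kind.INT ? ints : texts);
      reader.setPosition(position);
      readers[kind.ordinal()] = reader;
      readersCreated++;
    }
    return reader;
  }

  Object read() {
    return readerForCurrentRow().read();
  }
}
```

Sizing the cache from Kind.values().length is a choice made for this sketch only; the template itself allocates a fixed array of 43 entries.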

[12/17] arrow git commit: Update readme and add license in root.

Posted by ja...@apache.org.
Update readme and add license in root.


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/cbc56bf8
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/cbc56bf8
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/cbc56bf8

Branch: refs/heads/master
Commit: cbc56bf8ac423c585c782d5eda5c517ea8df8e3c
Parents: d5aa7c4
Author: Jacques Nadeau <ja...@apache.org>
Authored: Tue Feb 16 21:35:38 2016 -0800
Committer: Jacques Nadeau <ja...@apache.org>
Committed: Wed Feb 17 04:38:39 2016 -0800

----------------------------------------------------------------------
 LICENSE.txt | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 README.md   |  14 +++-
 2 files changed, 215 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/cbc56bf8/LICENSE.txt
----------------------------------------------------------------------
diff --git a/LICENSE.txt b/LICENSE.txt
new file mode 100644
index 0000000..d645695
--- /dev/null
+++ b/LICENSE.txt
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

http://git-wip-us.apache.org/repos/asf/arrow/blob/cbc56bf8/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index e2dc747..4423a91 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,13 @@
-arrow
+## Apache Arrow
+
+#### Powering Columnar In-Memory Analytics
+
+Arrow is a set of technologies that enable big-data systems to process and move data fast.
+
+Initial implementations include:
+
+ - [The Arrow Format](https://github.com/apache/arrow/tree/master/format)
+ - [Arrow Structures and APIs in C++](https://github.com/apache/arrow/tree/master/cpp)
+ - [Arrow Structures and APIs in Java](https://github.com/apache/arrow/tree/master/java)
+
+Arrow is an [Apache Software Foundation](http://www.apache.org) project. More info can be found at [arrow.apache.org](http://arrow.apache.org).


[03/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/reader/FieldReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/reader/FieldReader.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/reader/FieldReader.java
new file mode 100644
index 0000000..c4eb3dc
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/reader/FieldReader.java
@@ -0,0 +1,29 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.reader;
+
+import org.apache.arrow.vector.complex.reader.BaseReader.ListReader;
+import org.apache.arrow.vector.complex.reader.BaseReader.MapReader;
+import org.apache.arrow.vector.complex.reader.BaseReader.RepeatedListReader;
+import org.apache.arrow.vector.complex.reader.BaseReader.RepeatedMapReader;
+import org.apache.arrow.vector.complex.reader.BaseReader.ScalarReader;
+
+
+
+public interface FieldReader extends MapReader, ListReader, ScalarReader, RepeatedMapReader, RepeatedListReader {
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/writer/FieldWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/writer/FieldWriter.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/writer/FieldWriter.java
new file mode 100644
index 0000000..ecffe0b
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/writer/FieldWriter.java
@@ -0,0 +1,27 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.writer;
+
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ScalarWriter;
+
+public interface FieldWriter extends MapWriter, ListWriter, ScalarWriter {
+  void allocate();
+  void clear();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/ComplexHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/ComplexHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/ComplexHolder.java
new file mode 100644
index 0000000..0f9310d
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/ComplexHolder.java
@@ -0,0 +1,25 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.holders;
+
+import org.apache.arrow.vector.complex.reader.FieldReader;
+
+public class ComplexHolder implements ValueHolder {
+  public FieldReader reader;
+  public int isSet;
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/ObjectHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/ObjectHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/ObjectHolder.java
new file mode 100644
index 0000000..5a5fe03
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/ObjectHolder.java
@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector.holders;
+
+import org.apache.arrow.vector.types.Types;
+
+/*
+ * Holder class for the vector ObjectVector. This holder internally stores a
+ * reference to an object. The ObjectVector maintains an array of these objects.
+ * This holder may be used only as a workspace variable in aggregate functions.
+ * Avoid it where possible and prefer the native holder types.
+ */
+@Deprecated
+public class ObjectHolder implements ValueHolder {
+  public static final Types.MajorType TYPE = Types.required(Types.MinorType.GENERIC_OBJECT);
+
+  public Types.MajorType getType() {
+    return TYPE;
+  }
+
+  public Object obj;
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedListHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedListHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedListHolder.java
new file mode 100644
index 0000000..83506cd
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedListHolder.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.holders;
+
+public final class RepeatedListHolder implements ValueHolder{
+  public int start;
+  public int end;
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedMapHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedMapHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedMapHolder.java
new file mode 100644
index 0000000..85d782b
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/RepeatedMapHolder.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.holders;
+
+public final class RepeatedMapHolder implements ValueHolder{
+  public int start;
+  public int end;
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/UnionHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/UnionHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/UnionHolder.java
new file mode 100644
index 0000000..b868a62
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/UnionHolder.java
@@ -0,0 +1,37 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.holders;
+
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+public class UnionHolder implements ValueHolder {
+  public static final MajorType TYPE = new MajorType(MinorType.UNION, DataMode.OPTIONAL);
+  public FieldReader reader;
+  public int isSet;
+
+  public MajorType getType() {
+    return reader.getType();
+  }
+
+  public boolean isSet() {
+    return isSet == 1;
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/holders/ValueHolder.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/holders/ValueHolder.java b/java/vector/src/main/java/org/apache/arrow/vector/holders/ValueHolder.java
new file mode 100644
index 0000000..88cbcd4
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/holders/ValueHolder.java
@@ -0,0 +1,31 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.holders;
+
+/**
+ * Wrapper object for an individual value in Arrow.
+ *
+ * ValueHolders are designed to be mutable wrapper objects for defining clean
+ * APIs that access data in Arrow. For performance, object creation is avoided
+ * at all costs throughout execution. For this reason, ValueHolders are
+ * disallowed from implementing any methods. This allows them to be
+ * replaced by their Java primitive inner members during optimization of
+ * run-time generated code.
+ */
+public interface ValueHolder {
+}
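
The Javadoc above describes holders as mutable, method-free wrappers that let callers avoid allocating an object per value. A minimal sketch of that usage pattern follows. `IntHolder`, `read`, and `sum` are hypothetical names used only for illustration and are not part of this commit; the column is modelled as a plain `int[]` plus a validity array rather than a real vector.

```java
class HolderReuseSketch {
  /** Hypothetical marker interface, mirroring the one in the diff. */
  interface ValueHolder {}

  /** Hypothetical holder: public mutable fields only, no methods. */
  static final class IntHolder implements ValueHolder {
    public int isSet;  // 1 when value is meaningful, 0 when the slot is null
    public int value;
  }

  /** Fills the caller-supplied holder instead of returning a new object. */
  static void read(int[] values, boolean[] valid, int index, IntHolder out) {
    out.isSet = valid[index] ? 1 : 0;
    out.value = valid[index] ? values[index] : 0;
  }

  /** Sums the non-null slots, reusing one holder for the whole loop. */
  static long sum(int[] values, boolean[] valid) {
    IntHolder holder = new IntHolder();  // allocated once, outside the loop
    long total = 0;
    for (int i = 0; i < values.length; i++) {
      read(values, valid, i, holder);
      if (holder.isSet == 1) {
        total += holder.value;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    int[] values = {1, 7, 41};
    boolean[] valid = {true, false, true};
    System.out.println(sum(values, valid));  // prints 42
  }
}
```

Because the holder exposes only fields, a code generator could in principle replace `holder.isSet` and `holder.value` with two local primitives, which is the optimization the Javadoc alludes to.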

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/types/MaterializedField.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/types/MaterializedField.java b/java/vector/src/main/java/org/apache/arrow/vector/types/MaterializedField.java
new file mode 100644
index 0000000..c73098b
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/types/MaterializedField.java
@@ -0,0 +1,217 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.types;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashSet;
+import java.util.Objects;
+
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.util.BasicTypeHelper;
+
+
+public class MaterializedField {
+  private final String name;
+  private final MajorType type;
+  // use an ordered set as existing code relies on order (e.g. parquet writer)
+  private final LinkedHashSet<MaterializedField> children;
+
+  MaterializedField(String name, MajorType type, LinkedHashSet<MaterializedField> children) {
+    this.name = name;
+    this.type = type;
+    this.children = children;
+  }
+
+  public Collection<MaterializedField> getChildren() {
+    return new ArrayList<>(children);
+  }
+
+  public MaterializedField newWithChild(MaterializedField child) {
+    MaterializedField newField = clone();
+    newField.addChild(child);
+    return newField;
+  }
+
+  public void addChild(MaterializedField field){
+    children.add(field);
+  }
+
+  public MaterializedField clone() {
+    return withPathAndType(name, getType());
+  }
+
+  public MaterializedField withType(MajorType type) {
+    return withPathAndType(name, type);
+  }
+
+  public MaterializedField withPath(String name) {
+    return withPathAndType(name, getType());
+  }
+
+  public MaterializedField withPathAndType(String name, final MajorType type) {
+    final LinkedHashSet<MaterializedField> newChildren = new LinkedHashSet<>(children.size());
+    for (final MaterializedField child:children) {
+      newChildren.add(child.clone());
+    }
+    return new MaterializedField(name, type, newChildren);
+  }
+
+//  public String getLastName(){
+//    PathSegment seg = key.path.getRootSegment();
+//    while (seg.getChild() != null) {
+//      seg = seg.getChild();
+//    }
+//    return seg.getNameSegment().getPath();
+//  }
+
+
+  // TODO: rewrite as a direct match rather than conversion then match.
+//  public boolean matches(SerializedField booleanfield){
+//    MaterializedField f = create(field);
+//    return f.equals(this);
+//  }
+
+  public static MaterializedField create(String name, MajorType type){
+    return new MaterializedField(name, type, new LinkedHashSet<MaterializedField>());
+  }
+
+//  public String getName(){
+//    StringBuilder sb = new StringBuilder();
+//    boolean first = true;
+//    for(NamePart np : def.getNameList()){
+//      if(np.getType() == Type.ARRAY){
+//        sb.append("[]");
+//      }else{
+//        if(first){
+//          first = false;
+//        }else{
+//          sb.append(".");
+//        }
+//        sb.append('`');
+//        sb.append(np.getName());
+//        sb.append('`');
+//
+//      }
+//    }
+//    return sb.toString();
+//  }
+
+  public String getPath() {
+    return getName();
+  }
+
+  public String getLastName() {
+    return getName();
+  }
+
+  public String getName() {
+    return name;
+  }
+
+//  public int getWidth() {
+//    return type.getWidth();
+//  }
+
+  public MajorType getType() {
+    return type;
+  }
+
+  public int getScale() {
+      return type.getScale();
+  }
+  public int getPrecision() {
+      return type.getPrecision();
+  }
+  public boolean isNullable() {
+    return type.getMode() == DataMode.OPTIONAL;
+  }
+
+  public DataMode getDataMode() {
+    return type.getMode();
+  }
+
+  public MaterializedField getOtherNullableVersion(){
+    MajorType mt = type;
+    DataMode newDataMode = null;
+    switch (mt.getMode()){
+    case OPTIONAL:
+      newDataMode = DataMode.REQUIRED;
+      break;
+    case REQUIRED:
+      newDataMode = DataMode.OPTIONAL;
+      break;
+    default:
+      throw new UnsupportedOperationException();
+    }
+    return new MaterializedField(name, new MajorType(mt.getMinorType(), newDataMode, mt.getPrecision(), mt.getScale(), mt.getTimezone(), mt.getSubTypes()), children);
+  }
+
+  public Class<?> getValueClass() {
+    return BasicTypeHelper.getValueVectorClass(getType().getMinorType(), getDataMode());
+  }
+
+  @Override
+  public int hashCode() {
+    return Objects.hash(this.name, this.type, this.children);
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (this == obj) {
+      return true;
+    }
+    if (obj == null) {
+      return false;
+    }
+    if (getClass() != obj.getClass()) {
+      return false;
+    }
+    MaterializedField other = (MaterializedField) obj;
+    // DRILL-1872: Compute equals only on key. See also the comment
+    // in MapVector$MapTransferPair
+
+    return this.name.equalsIgnoreCase(other.name) &&
+            Objects.equals(this.type, other.type);
+  }
+
+
+  @Override
+  public String toString() {
+    final int maxLen = 10;
+    String childStr = children != null && !children.isEmpty() ? toString(children, maxLen) : "";
+    return name + "(" + type.getMinorType().name() + ":" + type.getMode().name() + ")" + childStr;
+  }
+
+
+  private String toString(Collection<?> collection, int maxLen) {
+    StringBuilder builder = new StringBuilder();
+    builder.append("[");
+    int i = 0;
+    for (Iterator<?> iterator = collection.iterator(); iterator.hasNext() && i < maxLen; i++) {
+      if (i > 0){
+        builder.append(", ");
+      }
+      builder.append(iterator.next());
+    }
+    builder.append("]");
+    return builder.toString();
+  }
+}
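
One detail in `MaterializedField` above: `equals` compares names case-insensitively and ignores `children`, while `hashCode` hashes the case-sensitive name together with `children`. Two fields that compare equal can therefore end up in different hash buckets. A minimal sketch of a consistent pairing follows. It uses a hypothetical `Field` stand-in (a name plus a string type tag, not the committed class) and assumes ASCII field names, for which lower-casing matches `equalsIgnoreCase`.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;

class FieldKeySketch {
  /** Hypothetical stand-in for MaterializedField, reduced to name and type. */
  static final class Field {
    final String name;
    final String type;

    Field(String name, String type) {
      this.name = name;
      this.type = type;
    }

    @Override
    public boolean equals(Object obj) {
      if (this == obj) {
        return true;
      }
      if (obj == null || getClass() != obj.getClass()) {
        return false;
      }
      Field other = (Field) obj;
      // Same key as in the diff: case-insensitive name plus type.
      return name.equalsIgnoreCase(other.name) && Objects.equals(type, other.type);
    }

    @Override
    public int hashCode() {
      // Hash only what equals compares, with the name normalized the same way.
      return Objects.hash(name.toLowerCase(Locale.ROOT), type);
    }
  }

  public static void main(String[] args) {
    Set<Field> fields = new HashSet<>();
    fields.add(new Field("Price", "FLOAT8"));
    // Equal under equals, so the lookup must succeed.
    System.out.println(fields.contains(new Field("price", "FLOAT8")));  // prints true
  }
}
```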

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/types/Types.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/types/Types.java b/java/vector/src/main/java/org/apache/arrow/vector/types/Types.java
new file mode 100644
index 0000000..cef892c
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/types/Types.java
@@ -0,0 +1,132 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.types;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+public class Types {
+  public enum MinorType {
+    LATE,   //  late binding type
+    MAP,   //  an empty map column.  Useful for conceptual setup.  Children listed within here
+
+    TINYINT,   //  single byte signed integer
+    SMALLINT,   //  two byte signed integer
+    INT,   //  four byte signed integer
+    BIGINT,   //  eight byte signed integer
+    DECIMAL9,   //  a decimal supporting precision between 1 and 9
+    DECIMAL18,   //  a decimal supporting precision between 10 and 18
+    DECIMAL28SPARSE,   //  a decimal supporting precision between 19 and 28
+    DECIMAL38SPARSE,   //  a decimal supporting precision between 29 and 38
+    MONEY,   //  signed decimal with two digit precision
+    DATE,   //  days since 4713bc
+    TIME,   //  time in micros before or after 2000/1/1
+    TIMETZ,  //  time in micros before or after 2000/1/1 with timezone
+    TIMESTAMPTZ,   //  unix epoch time in millis
+    TIMESTAMP,   //  TBD
+    INTERVAL,   //  TBD
+    FLOAT4,   //  4 byte ieee 754
+    FLOAT8,   //  8 byte ieee 754
+    BIT,  //  single bit value (boolean)
+    FIXEDCHAR,  //  utf8 fixed length string, padded with spaces
+    FIXED16CHAR,
+    FIXEDBINARY,   //  fixed length binary, padded with 0 bytes
+    VARCHAR,   //  utf8 variable length string
+    VAR16CHAR, // utf16 variable length string
+    VARBINARY,   //  variable length binary
+    UINT1,  //  unsigned 1 byte integer
+    UINT2,  //  unsigned 2 byte integer
+    UINT4,   //  unsigned 4 byte integer
+    UINT8,   //  unsigned 8 byte integer
+    DECIMAL28DENSE, // dense decimal representation, supporting precision between 19 and 28
+    DECIMAL38DENSE, // dense decimal representation, supporting precision between 29 and 38
+    NULL, // a value of unknown type (e.g. a missing reference).
+    INTERVALYEAR, // Interval type specifying YEAR to MONTH
+    INTERVALDAY, // Interval type specifying DAY to SECONDS
+    LIST,
+    GENERIC_OBJECT,
+    UNION
+  }
+
+  public enum DataMode {
+    REQUIRED,
+    OPTIONAL,
+    REPEATED
+  }
+
+  public static class MajorType {
+    private MinorType minorType;
+    private DataMode mode;
+    private Integer precision;
+    private Integer scale;
+    private Integer timezone;
+    private List<MinorType> subTypes;
+
+    public MajorType(MinorType minorType, DataMode mode) {
+      this(minorType, mode, null, null, null, null);
+    }
+
+    public MajorType(MinorType minorType, DataMode mode, Integer precision, Integer scale) {
+      this(minorType, mode, precision, scale, null, null);
+    }
+
+    public MajorType(MinorType minorType, DataMode mode, Integer precision, Integer scale, Integer timezone, List<MinorType> subTypes) {
+      this.minorType = minorType;
+      this.mode = mode;
+      this.precision = precision;
+      this.scale = scale;
+      this.timezone = timezone;
+      this.subTypes = subTypes;
+    }
+
+    public MinorType getMinorType() {
+      return minorType;
+    }
+
+    public DataMode getMode() {
+      return mode;
+    }
+
+    public Integer getPrecision() {
+      return precision;
+    }
+
+    public Integer getScale() {
+      return scale;
+    }
+
+    public Integer getTimezone() {
+      return timezone;
+    }
+
+    public List<MinorType> getSubTypes() {
+      return subTypes;
+    }
+  }
+
+  public static MajorType required(MinorType minorType) {
+    return new MajorType(minorType, DataMode.REQUIRED);
+  }
+  public static MajorType optional(MinorType minorType) {
+    return new MajorType(minorType, DataMode.OPTIONAL);
+  }
+  public static MajorType repeated(MinorType minorType) {
+    return new MajorType(minorType, DataMode.REPEATED);
+  }
+}
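
The `memcmp` helper in `ByteFunctionHelpers.java`, the next file in this diff, compares eight bytes at a time and, on a mismatch, byte-swaps both words before an unsigned comparison. The sketch below illustrates why the swap is needed. It assumes the words are loaded in little-endian order (the sketch forces this with `ByteOrder.LITTLE_ENDIAN`) and uses `Long.compareUnsigned` from the JDK in place of Guava's `UnsignedLongs.compare`. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LexCompareSketch {
  /** Compares the first 8 bytes of each array with a single word comparison. */
  static int compareWord(byte[] left, byte[] right) {
    // A little-endian load puts the first byte in memory in the LOW bits.
    long l = ByteBuffer.wrap(left).order(ByteOrder.LITTLE_ENDIAN).getLong();
    long r = ByteBuffer.wrap(right).order(ByteOrder.LITTLE_ENDIAN).getLong();
    if (l == r) {
      return 0;
    }
    // After the swap the first byte in memory is the most significant one,
    // so unsigned numeric order equals lexicographic byte order.
    return Long.compareUnsigned(Long.reverseBytes(l), Long.reverseBytes(r)) < 0 ? -1 : 1;
  }

  /** Reference implementation: unsigned byte-by-byte comparison. */
  static int compareBytewise(byte[] left, byte[] right) {
    for (int i = 0; i < 8; i++) {
      int a = left[i] & 0xFF;
      int b = right[i] & 0xFF;
      if (a != b) {
        return a < b ? -1 : 1;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    byte[] a = {1, 0, 0, 0, 0, 0, 0, 2};
    byte[] b = {0, 0, 0, 0, 0, 0, 0, 3};
    // The first byte decides: 1 > 0, so a sorts after b.
    System.out.println(compareWord(a, b));      // prints 1
    System.out.println(compareBytewise(a, b));  // prints 1
  }
}
```

Without the byte swap, the little-endian words would compare as `0x0200000000000001` against `0x0300000000000000`, and the last byte in memory would wrongly decide the order.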

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/ByteFunctionHelpers.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/ByteFunctionHelpers.java b/java/vector/src/main/java/org/apache/arrow/vector/util/ByteFunctionHelpers.java
new file mode 100644
index 0000000..2bdfd70
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/ByteFunctionHelpers.java
@@ -0,0 +1,233 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.util;
+
+import io.netty.buffer.ArrowBuf;
+import io.netty.util.internal.PlatformDependent;
+
+import org.apache.arrow.memory.BoundsChecking;
+
+import com.google.common.primitives.UnsignedLongs;
+
+public class ByteFunctionHelpers {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ByteFunctionHelpers.class);
+
+  /**
+   * Helper function to check for equality of bytes in two ArrowBufs
+   *
+   * @param left Left ArrowBuf for comparison
+   * @param lStart start offset in the buffer
+   * @param lEnd end offset in the buffer
+   * @param right Right ArrowBuf for comparison
+   * @param rStart start offset in the buffer
+   * @param rEnd end offset in the buffer
+   * @return 1 if the two byte ranges are equal, 0 otherwise
+   */
+  public static final int equal(final ArrowBuf left, int lStart, int lEnd, final ArrowBuf right, int rStart, int rEnd){
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      left.checkBytes(lStart, lEnd);
+      right.checkBytes(rStart, rEnd);
+    }
+    return memEqual(left.memoryAddress(), lStart, lEnd, right.memoryAddress(), rStart, rEnd);
+  }
+
+  private static final int memEqual(final long laddr, int lStart, int lEnd, final long raddr, int rStart,
+      final int rEnd) {
+
+    int n = lEnd - lStart;
+    if (n == rEnd - rStart) {
+      long lPos = laddr + lStart;
+      long rPos = raddr + rStart;
+
+      while (n > 7) {
+        long leftLong = PlatformDependent.getLong(lPos);
+        long rightLong = PlatformDependent.getLong(rPos);
+        if (leftLong != rightLong) {
+          return 0;
+        }
+        lPos += 8;
+        rPos += 8;
+        n -= 8;
+      }
+      while (n-- != 0) {
+        byte leftByte = PlatformDependent.getByte(lPos);
+        byte rightByte = PlatformDependent.getByte(rPos);
+        if (leftByte != rightByte) {
+          return 0;
+        }
+        lPos++;
+        rPos++;
+      }
+      return 1;
+    } else {
+      return 0;
+    }
+  }
+
+  /**
+   * Helper function to compare a set of bytes in two ArrowBufs.
+   *
+   * The offsets are bounds-checked first when bounds checking is enabled.
+   *
+   * @param left Left DrillBuf to compare
+   * @param lStart start offset in the buffer
+   * @param lEnd end offset in the buffer
+   * @param right Right DrillBuf to compare
+   * @param rStart start offset in the buffer
+   * @param rEnd end offset in the buffer
+   * @return 1 if left input is greater, -1 if left input is smaller, 0 otherwise
+   */
+  public static final int compare(final ArrowBuf left, int lStart, int lEnd, final ArrowBuf right, int rStart, int rEnd){
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      left.checkBytes(lStart, lEnd);
+      right.checkBytes(rStart, rEnd);
+    }
+    return memcmp(left.memoryAddress(), lStart, lEnd, right.memoryAddress(), rStart, rEnd);
+  }
+
+  private static final int memcmp(final long laddr, int lStart, int lEnd, final long raddr, int rStart, final int rEnd) {
+    int lLen = lEnd - lStart;
+    int rLen = rEnd - rStart;
+    int n = Math.min(rLen, lLen);
+    long lPos = laddr + lStart;
+    long rPos = raddr + rStart;
+
+    while (n > 7) {
+      long leftLong = PlatformDependent.getLong(lPos);
+      long rightLong = PlatformDependent.getLong(rPos);
+      if (leftLong != rightLong) {
+        return UnsignedLongs.compare(Long.reverseBytes(leftLong), Long.reverseBytes(rightLong));
+      }
+      lPos += 8;
+      rPos += 8;
+      n -= 8;
+    }
+
+    while (n-- != 0) {
+      byte leftByte = PlatformDependent.getByte(lPos);
+      byte rightByte = PlatformDependent.getByte(rPos);
+      if (leftByte != rightByte) {
+        return ((leftByte & 0xFF) - (rightByte & 0xFF)) > 0 ? 1 : -1;
+      }
+      lPos++;
+      rPos++;
+    }
+
+    if (lLen == rLen) {
+      return 0;
+    }
+
+    return lLen > rLen ? 1 : -1;
+
+  }
+
+  /**
+   * Helper function to compare a set of bytes in an ArrowBuf to a byte array.
+   *
+   * @param left Left ArrowBuf for comparison purposes
+   * @param lStart start offset in the buffer
+   * @param lEnd end offset (exclusive) in the buffer
+   * @param right byte array to compare against
+   * @param rStart start offset in the byte array
+   * @param rEnd end offset (exclusive) in the byte array
+   * @return 1 if left input is greater, -1 if left input is smaller, 0 otherwise
+   */
+  public static final int compare(final ArrowBuf left, int lStart, int lEnd, final byte[] right, int rStart, final int rEnd) {
+    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
+      left.checkBytes(lStart, lEnd);
+    }
+    return memcmp(left.memoryAddress(), lStart, lEnd, right, rStart, rEnd);
+  }
+
+
+  private static final int memcmp(final long laddr, int lStart, int lEnd, final byte[] right, int rStart, final int rEnd) {
+    int lLen = lEnd - lStart;
+    int rLen = rEnd - rStart;
+    int n = Math.min(rLen, lLen);
+    long lPos = laddr + lStart;
+    int rPos = rStart;
+
+    while (n-- != 0) {
+      byte leftByte = PlatformDependent.getByte(lPos);
+      byte rightByte = right[rPos];
+      if (leftByte != rightByte) {
+        return ((leftByte & 0xFF) - (rightByte & 0xFF)) > 0 ? 1 : -1;
+      }
+      lPos++;
+      rPos++;
+    }
+
+    if (lLen == rLen) {
+      return 0;
+    }
+
+    return lLen > rLen ? 1 : -1;
+  }
+
+  /*
+   * Following are helper functions to interact with sparse decimal represented in a byte array.
+   */
+
+  // Get the integer at the given index, ignoring the sign bit
+  public static int getInteger(byte[] b, int index) {
+    return getInteger(b, index, true);
+  }
+  // Get the integer at the given index; if ignoreSign is true, the sign bit of the first integer is masked out
+  public static int getInteger(byte[] b, int index, boolean ignoreSign) {
+    int startIndex = index * DecimalUtility.INTEGER_SIZE;
+
+    if (index == 0 && ignoreSign == true) {
+      return (b[startIndex + 3] & 0xFF) |
+             (b[startIndex + 2] & 0xFF) << 8 |
+             (b[startIndex + 1] & 0xFF) << 16 |
+             (b[startIndex] & 0x7F) << 24;
+    }
+
+    return ((b[startIndex + 3] & 0xFF) |
+        (b[startIndex + 2] & 0xFF) << 8 |
+        (b[startIndex + 1] & 0xFF) << 16 |
+        (b[startIndex] & 0xFF) << 24);
+
+  }
+
+  // Set integer in the byte array
+  public static void setInteger(byte[] b, int index, int value) {
+    int startIndex = index * DecimalUtility.INTEGER_SIZE;
+    b[startIndex] = (byte) ((value >> 24) & 0xFF);
+    b[startIndex + 1] = (byte) ((value >> 16) & 0xFF);
+    b[startIndex + 2] = (byte) ((value >> 8) & 0xFF);
+    b[startIndex + 3] = (byte) ((value) & 0xFF);
+  }
+
+  // Set the sign in a sparse decimal representation
+  public static void setSign(byte[] b, boolean sign) {
+    int value = getInteger(b, 0);
+    if (sign == true) {
+      setInteger(b, 0, value | 0x80000000);
+    } else {
+      setInteger(b, 0, value & 0x7FFFFFFF);
+    }
+  }
+
+  // Get the sign
+  public static boolean getSign(byte[] b) {
+    return ((getInteger(b, 0, false) & 0x80000000) != 0);
+  }
+
+}
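
The word-at-a-time comparison in memcmp above can be sketched in plain Java without direct memory access. This is only an illustration under stated assumptions: `UnsignedCompareSketch` and `compareUnsigned` are hypothetical names that do not appear in the patch, a little-endian `ByteBuffer` stands in for `PlatformDependent.getLong`, and the JDK's `Long.compareUnsigned` is swapped in for Guava's `UnsignedLongs.compare`. The point it shows is why the bytes are reversed: in a little-endian read the first byte in memory is the least significant, so the words must be byte-swapped before an unsigned comparison yields lexicographic order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnsignedCompareSketch {

  // Hypothetical helper (not part of the patch): lexicographic comparison of
  // two byte ranges with each byte treated as unsigned. End offsets are exclusive.
  public static int compareUnsigned(byte[] left, int lStart, int lEnd,
                                    byte[] right, int rStart, int rEnd) {
    int lLen = lEnd - lStart;
    int rLen = rEnd - rStart;
    int n = Math.min(lLen, rLen);
    // Assumption: reads are little-endian, so byte 0 lands in the low bits.
    ByteBuffer l = ByteBuffer.wrap(left).order(ByteOrder.LITTLE_ENDIAN);
    ByteBuffer r = ByteBuffer.wrap(right).order(ByteOrder.LITTLE_ENDIAN);
    int lPos = lStart;
    int rPos = rStart;

    // Compare eight bytes at a time while at least eight remain.
    while (n > 7) {
      long lw = l.getLong(lPos);
      long rw = r.getLong(rPos);
      if (lw != rw) {
        // Byte-swap so the first byte in memory becomes the most significant.
        return Integer.signum(
            Long.compareUnsigned(Long.reverseBytes(lw), Long.reverseBytes(rw)));
      }
      lPos += 8;
      rPos += 8;
      n -= 8;
    }

    // Compare the remaining tail one byte at a time.
    while (n-- != 0) {
      int lb = left[lPos++] & 0xFF;
      int rb = right[rPos++] & 0xFF;
      if (lb != rb) {
        return lb > rb ? 1 : -1;
      }
    }

    // Common prefix is identical: the shorter range sorts first.
    return Integer.compare(lLen, rLen);
  }

  public static void main(String[] args) {
    byte[] x = {1, 0, 0, 0, 0, 0, 0, 0};
    byte[] y = {0, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF,
                (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
    // x is lexicographically greater because its first byte is larger.
    System.out.println(compareUnsigned(x, 0, 8, y, 0, 8));
  }
}
```

Without the byte swap, the little-endian value of `y` would compare as larger than that of `x`, which is the opposite of the byte-by-byte ordering.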

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/CallBack.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/CallBack.java b/java/vector/src/main/java/org/apache/arrow/vector/util/CallBack.java
new file mode 100644
index 0000000..2498342
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/CallBack.java
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+
+public interface CallBack {
+  public void doWork();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/CoreDecimalUtility.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/CoreDecimalUtility.java b/java/vector/src/main/java/org/apache/arrow/vector/util/CoreDecimalUtility.java
new file mode 100644
index 0000000..1eb2c13
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/CoreDecimalUtility.java
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.util;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.types.Types;
+
+public class CoreDecimalUtility {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(CoreDecimalUtility.class);
+
+  public static long getDecimal18FromBigDecimal(BigDecimal input, int scale, int precision) {
+    // Round (half up) or pad to set the input to the correct scale
+    input = input.setScale(scale, BigDecimal.ROUND_HALF_UP);
+
+    return (input.unscaledValue().longValue());
+  }
+
+  public static int getMaxPrecision(Types.MinorType decimalType) {
+    if (decimalType == Types.MinorType.DECIMAL9) {
+      return 9;
+    } else if (decimalType == Types.MinorType.DECIMAL18) {
+      return 18;
+    } else if (decimalType == Types.MinorType.DECIMAL28SPARSE) {
+      return 28;
+    } else if (decimalType == Types.MinorType.DECIMAL38SPARSE) {
+      return 38;
+    }
+    return 0;
+  }
+
+  /*
+   * Returns the minor decimal type that can hold the given precision
+   */
+  public static Types.MinorType getDecimalDataType(int precision) {
+    if (precision <= 9) {
+      return Types.MinorType.DECIMAL9;
+    } else if (precision <= 18) {
+      return Types.MinorType.DECIMAL18;
+    } else if (precision <= 28) {
+      return Types.MinorType.DECIMAL28SPARSE;
+    } else {
+      return Types.MinorType.DECIMAL38SPARSE;
+    }
+  }
+
+  /*
+   * Given a precision, returns the maximum precision of the decimal data type
+   * used to store it. For example, given the precision 12, we would use DECIMAL18
+   * to store the data, which has a maximum precision of 18 digits.
+   */
+  public static int getPrecisionRange(int precision) {
+    return getMaxPrecision(getDecimalDataType(precision));
+  }
+  public static int getDecimal9FromBigDecimal(BigDecimal input, int scale, int precision) {
+    // Round (half up) or pad to set the input to the correct scale
+    input = input.setScale(scale, BigDecimal.ROUND_HALF_UP);
+
+    return (input.unscaledValue().intValue());
+  }
+
+  /*
+   * Helper function to detect if the given data type is Decimal
+   */
+  public static boolean isDecimalType(Types.MajorType type) {
+    return isDecimalType(type.getMinorType());
+  }
+
+  public static boolean isDecimalType(Types.MinorType minorType) {
+    if (minorType == Types.MinorType.DECIMAL9 || minorType == Types.MinorType.DECIMAL18 ||
+        minorType == Types.MinorType.DECIMAL28SPARSE || minorType == Types.MinorType.DECIMAL38SPARSE) {
+      return true;
+    }
+    return false;
+  }
+}
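
The rescaling and precision-bucket logic in CoreDecimalUtility above can be illustrated with a small standalone sketch. It is not the patch's code: `DecimalScaleSketch`, `toUnscaledLong` and `precisionRange` are hypothetical names, `RoundingMode.HALF_UP` is used in place of the `BigDecimal.ROUND_HALF_UP` constant, and `longValueExact()` is swapped in for `longValue()` so that an overflow raises an exception instead of silently dropping the high bits.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleSketch {

  // Hypothetical helper: rescale the input (rounding half up, or padding with
  // zeros) and return its unscaled value as a long.
  public static long toUnscaledLong(BigDecimal input, int scale) {
    BigDecimal rescaled = input.setScale(scale, RoundingMode.HALF_UP);
    // Assumption: throw on overflow rather than truncate, unlike longValue().
    return rescaled.unscaledValue().longValueExact();
  }

  // Hypothetical helper mirroring the precision buckets 9, 18, 28 and 38.
  public static int precisionRange(int precision) {
    if (precision <= 9) {
      return 9;
    } else if (precision <= 18) {
      return 18;
    } else if (precision <= 28) {
      return 28;
    }
    return 38;
  }

  public static void main(String[] args) {
    System.out.println(toUnscaledLong(new BigDecimal("123.456"), 2)); // 12346
    System.out.println(toUnscaledLong(new BigDecimal("1.5"), 3));     // 1500
    System.out.println(precisionRange(12));                           // 18
  }
}
```

Rounding to scale 2 turns 123.456 into 123.46, whose unscaled value is 12346; padding 1.5 to scale 3 gives 1.500, whose unscaled value is 1500.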

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/util/DateUtility.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/util/DateUtility.java b/java/vector/src/main/java/org/apache/arrow/vector/util/DateUtility.java
new file mode 100644
index 0000000..f4fc173
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/util/DateUtility.java
@@ -0,0 +1,682 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector.util;
+
+import org.joda.time.Period;
+import org.joda.time.format.DateTimeFormat;
+import org.joda.time.format.DateTimeFormatter;
+import org.joda.time.format.DateTimeFormatterBuilder;
+import org.joda.time.format.DateTimeParser;
+
+import com.carrotsearch.hppc.ObjectIntHashMap;
+
+// Utility class for Date, DateTime, TimeStamp, Interval data types
+public class DateUtility {
+
+
+    /* We have a hashmap that stores the timezone as the key and an index as the value.
+     * When storing the timezone in value vectors and holders, we only use this index. When
+     * reconstructing the timestamp, we use this index to look up the corresponding timezone
+     * in the array timezoneList and pass it to joda-time.
+     */
+  public static ObjectIntHashMap<String> timezoneMap = new ObjectIntHashMap<String>();
+
+    public static String[] timezoneList =  {"Africa/Abidjan",
+                                            "Africa/Accra",
+                                            "Africa/Addis_Ababa",
+                                            "Africa/Algiers",
+                                            "Africa/Asmara",
+                                            "Africa/Asmera",
+                                            "Africa/Bamako",
+                                            "Africa/Bangui",
+                                            "Africa/Banjul",
+                                            "Africa/Bissau",
+                                            "Africa/Blantyre",
+                                            "Africa/Brazzaville",
+                                            "Africa/Bujumbura",
+                                            "Africa/Cairo",
+                                            "Africa/Casablanca",
+                                            "Africa/Ceuta",
+                                            "Africa/Conakry",
+                                            "Africa/Dakar",
+                                            "Africa/Dar_es_Salaam",
+                                            "Africa/Djibouti",
+                                            "Africa/Douala",
+                                            "Africa/El_Aaiun",
+                                            "Africa/Freetown",
+                                            "Africa/Gaborone",
+                                            "Africa/Harare",
+                                            "Africa/Johannesburg",
+                                            "Africa/Juba",
+                                            "Africa/Kampala",
+                                            "Africa/Khartoum",
+                                            "Africa/Kigali",
+                                            "Africa/Kinshasa",
+                                            "Africa/Lagos",
+                                            "Africa/Libreville",
+                                            "Africa/Lome",
+                                            "Africa/Luanda",
+                                            "Africa/Lubumbashi",
+                                            "Africa/Lusaka",
+                                            "Africa/Malabo",
+                                            "Africa/Maputo",
+                                            "Africa/Maseru",
+                                            "Africa/Mbabane",
+                                            "Africa/Mogadishu",
+                                            "Africa/Monrovia",
+                                            "Africa/Nairobi",
+                                            "Africa/Ndjamena",
+                                            "Africa/Niamey",
+                                            "Africa/Nouakchott",
+                                            "Africa/Ouagadougou",
+                                            "Africa/Porto-Novo",
+                                            "Africa/Sao_Tome",
+                                            "Africa/Timbuktu",
+                                            "Africa/Tripoli",
+                                            "Africa/Tunis",
+                                            "Africa/Windhoek",
+                                            "America/Adak",
+                                            "America/Anchorage",
+                                            "America/Anguilla",
+                                            "America/Antigua",
+                                            "America/Araguaina",
+                                            "America/Argentina/Buenos_Aires",
+                                            "America/Argentina/Catamarca",
+                                            "America/Argentina/ComodRivadavia",
+                                            "America/Argentina/Cordoba",
+                                            "America/Argentina/Jujuy",
+                                            "America/Argentina/La_Rioja",
+                                            "America/Argentina/Mendoza",
+                                            "America/Argentina/Rio_Gallegos",
+                                            "America/Argentina/Salta",
+                                            "America/Argentina/San_Juan",
+                                            "America/Argentina/San_Luis",
+                                            "America/Argentina/Tucuman",
+                                            "America/Argentina/Ushuaia",
+                                            "America/Aruba",
+                                            "America/Asuncion",
+                                            "America/Atikokan",
+                                            "America/Atka",
+                                            "America/Bahia",
+                                            "America/Bahia_Banderas",
+                                            "America/Barbados",
+                                            "America/Belem",
+                                            "America/Belize",
+                                            "America/Blanc-Sablon",
+                                            "America/Boa_Vista",
+                                            "America/Bogota",
+                                            "America/Boise",
+                                            "America/Buenos_Aires",
+                                            "America/Cambridge_Bay",
+                                            "America/Campo_Grande",
+                                            "America/Cancun",
+                                            "America/Caracas",
+                                            "America/Catamarca",
+                                            "America/Cayenne",
+                                            "America/Cayman",
+                                            "America/Chicago",
+                                            "America/Chihuahua",
+                                            "America/Coral_Harbour",
+                                            "America/Cordoba",
+                                            "America/Costa_Rica",
+                                            "America/Cuiaba",
+                                            "America/Curacao",
+                                            "America/Danmarkshavn",
+                                            "America/Dawson",
+                                            "America/Dawson_Creek",
+                                            "America/Denver",
+                                            "America/Detroit",
+                                            "America/Dominica",
+                                            "America/Edmonton",
+                                            "America/Eirunepe",
+                                            "America/El_Salvador",
+                                            "America/Ensenada",
+                                            "America/Fort_Wayne",
+                                            "America/Fortaleza",
+                                            "America/Glace_Bay",
+                                            "America/Godthab",
+                                            "America/Goose_Bay",
+                                            "America/Grand_Turk",
+                                            "America/Grenada",
+                                            "America/Guadeloupe",
+                                            "America/Guatemala",
+                                            "America/Guayaquil",
+                                            "America/Guyana",
+                                            "America/Halifax",
+                                            "America/Havana",
+                                            "America/Hermosillo",
+                                            "America/Indiana/Indianapolis",
+                                            "America/Indiana/Knox",
+                                            "America/Indiana/Marengo",
+                                            "America/Indiana/Petersburg",
+                                            "America/Indiana/Tell_City",
+                                            "America/Indiana/Vevay",
+                                            "America/Indiana/Vincennes",
+                                            "America/Indiana/Winamac",
+                                            "America/Indianapolis",
+                                            "America/Inuvik",
+                                            "America/Iqaluit",
+                                            "America/Jamaica",
+                                            "America/Jujuy",
+                                            "America/Juneau",
+                                            "America/Kentucky/Louisville",
+                                            "America/Kentucky/Monticello",
+                                            "America/Knox_IN",
+                                            "America/Kralendijk",
+                                            "America/La_Paz",
+                                            "America/Lima",
+                                            "America/Los_Angeles",
+                                            "America/Louisville",
+                                            "America/Lower_Princes",
+                                            "America/Maceio",
+                                            "America/Managua",
+                                            "America/Manaus",
+                                            "America/Marigot",
+                                            "America/Martinique",
+                                            "America/Matamoros",
+                                            "America/Mazatlan",
+                                            "America/Mendoza",
+                                            "America/Menominee",
+                                            "America/Merida",
+                                            "America/Metlakatla",
+                                            "America/Mexico_City",
+                                            "America/Miquelon",
+                                            "America/Moncton",
+                                            "America/Monterrey",
+                                            "America/Montevideo",
+                                            "America/Montreal",
+                                            "America/Montserrat",
+                                            "America/Nassau",
+                                            "America/New_York",
+                                            "America/Nipigon",
+                                            "America/Nome",
+                                            "America/Noronha",
+                                            "America/North_Dakota/Beulah",
+                                            "America/North_Dakota/Center",
+                                            "America/North_Dakota/New_Salem",
+                                            "America/Ojinaga",
+                                            "America/Panama",
+                                            "America/Pangnirtung",
+                                            "America/Paramaribo",
+                                            "America/Phoenix",
+                                            "America/Port-au-Prince",
+                                            "America/Port_of_Spain",
+                                            "America/Porto_Acre",
+                                            "America/Porto_Velho",
+                                            "America/Puerto_Rico",
+                                            "America/Rainy_River",
+                                            "America/Rankin_Inlet",
+                                            "America/Recife",
+                                            "America/Regina",
+                                            "America/Resolute",
+                                            "America/Rio_Branco",
+                                            "America/Rosario",
+                                            "America/Santa_Isabel",
+                                            "America/Santarem",
+                                            "America/Santiago",
+                                            "America/Santo_Domingo",
+                                            "America/Sao_Paulo",
+                                            "America/Scoresbysund",
+                                            "America/Shiprock",
+                                            "America/Sitka",
+                                            "America/St_Barthelemy",
+                                            "America/St_Johns",
+                                            "America/St_Kitts",
+                                            "America/St_Lucia",
+                                            "America/St_Thomas",
+                                            "America/St_Vincent",
+                                            "America/Swift_Current",
+                                            "America/Tegucigalpa",
+                                            "America/Thule",
+                                            "America/Thunder_Bay",
+                                            "America/Tijuana",
+                                            "America/Toronto",
+                                            "America/Tortola",
+                                            "America/Vancouver",
+                                            "America/Virgin",
+                                            "America/Whitehorse",
+                                            "America/Winnipeg",
+                                            "America/Yakutat",
+                                            "America/Yellowknife",
+                                            "Antarctica/Casey",
+                                            "Antarctica/Davis",
+                                            "Antarctica/DumontDUrville",
+                                            "Antarctica/Macquarie",
+                                            "Antarctica/Mawson",
+                                            "Antarctica/McMurdo",
+                                            "Antarctica/Palmer",
+                                            "Antarctica/Rothera",
+                                            "Antarctica/South_Pole",
+                                            "Antarctica/Syowa",
+                                            "Antarctica/Vostok",
+                                            "Arctic/Longyearbyen",
+                                            "Asia/Aden",
+                                            "Asia/Almaty",
+                                            "Asia/Amman",
+                                            "Asia/Anadyr",
+                                            "Asia/Aqtau",
+                                            "Asia/Aqtobe",
+                                            "Asia/Ashgabat",
+                                            "Asia/Ashkhabad",
+                                            "Asia/Baghdad",
+                                            "Asia/Bahrain",
+                                            "Asia/Baku",
+                                            "Asia/Bangkok",
+                                            "Asia/Beirut",
+                                            "Asia/Bishkek",
+                                            "Asia/Brunei",
+                                            "Asia/Calcutta",
+                                            "Asia/Choibalsan",
+                                            "Asia/Chongqing",
+                                            "Asia/Chungking",
+                                            "Asia/Colombo",
+                                            "Asia/Dacca",
+                                            "Asia/Damascus",
+                                            "Asia/Dhaka",
+                                            "Asia/Dili",
+                                            "Asia/Dubai",
+                                            "Asia/Dushanbe",
+                                            "Asia/Gaza",
+                                            "Asia/Harbin",
+                                            "Asia/Hebron",
+                                            "Asia/Ho_Chi_Minh",
+                                            "Asia/Hong_Kong",
+                                            "Asia/Hovd",
+                                            "Asia/Irkutsk",
+                                            "Asia/Istanbul",
+                                            "Asia/Jakarta",
+                                            "Asia/Jayapura",
+                                            "Asia/Jerusalem",
+                                            "Asia/Kabul",
+                                            "Asia/Kamchatka",
+                                            "Asia/Karachi",
+                                            "Asia/Kashgar",
+                                            "Asia/Kathmandu",
+                                            "Asia/Katmandu",
+                                            "Asia/Kolkata",
+                                            "Asia/Krasnoyarsk",
+                                            "Asia/Kuala_Lumpur",
+                                            "Asia/Kuching",
+                                            "Asia/Kuwait",
+                                            "Asia/Macao",
+                                            "Asia/Macau",
+                                            "Asia/Magadan",
+                                            "Asia/Makassar",
+                                            "Asia/Manila",
+                                            "Asia/Muscat",
+                                            "Asia/Nicosia",
+                                            "Asia/Novokuznetsk",
+                                            "Asia/Novosibirsk",
+                                            "Asia/Omsk",
+                                            "Asia/Oral",
+                                            "Asia/Phnom_Penh",
+                                            "Asia/Pontianak",
+                                            "Asia/Pyongyang",
+                                            "Asia/Qatar",
+                                            "Asia/Qyzylorda",
+                                            "Asia/Rangoon",
+                                            "Asia/Riyadh",
+                                            "Asia/Saigon",
+                                            "Asia/Sakhalin",
+                                            "Asia/Samarkand",
+                                            "Asia/Seoul",
+                                            "Asia/Shanghai",
+                                            "Asia/Singapore",
+                                            "Asia/Taipei",
+                                            "Asia/Tashkent",
+                                            "Asia/Tbilisi",
+                                            "Asia/Tehran",
+                                            "Asia/Tel_Aviv",
+                                            "Asia/Thimbu",
+                                            "Asia/Thimphu",
+                                            "Asia/Tokyo",
+                                            "Asia/Ujung_Pandang",
+                                            "Asia/Ulaanbaatar",
+                                            "Asia/Ulan_Bator",
+                                            "Asia/Urumqi",
+                                            "Asia/Vientiane",
+                                            "Asia/Vladivostok",
+                                            "Asia/Yakutsk",
+                                            "Asia/Yekaterinburg",
+                                            "Asia/Yerevan",
+                                            "Atlantic/Azores",
+                                            "Atlantic/Bermuda",
+                                            "Atlantic/Canary",
+                                            "Atlantic/Cape_Verde",
+                                            "Atlantic/Faeroe",
+                                            "Atlantic/Faroe",
+                                            "Atlantic/Jan_Mayen",
+                                            "Atlantic/Madeira",
+                                            "Atlantic/Reykjavik",
+                                            "Atlantic/South_Georgia",
+                                            "Atlantic/St_Helena",
+                                            "Atlantic/Stanley",
+                                            "Australia/ACT",
+                                            "Australia/Adelaide",
+                                            "Australia/Brisbane",
+                                            "Australia/Broken_Hill",
+                                            "Australia/Canberra",
+                                            "Australia/Currie",
+                                            "Australia/Darwin",
+                                            "Australia/Eucla",
+                                            "Australia/Hobart",
+                                            "Australia/LHI",
+                                            "Australia/Lindeman",
+                                            "Australia/Lord_Howe",
+                                            "Australia/Melbourne",
+                                            "Australia/NSW",
+                                            "Australia/North",
+                                            "Australia/Perth",
+                                            "Australia/Queensland",
+                                            "Australia/South",
+                                            "Australia/Sydney",
+                                            "Australia/Tasmania",
+                                            "Australia/Victoria",
+                                            "Australia/West",
+                                            "Australia/Yancowinna",
+                                            "Brazil/Acre",
+                                            "Brazil/DeNoronha",
+                                            "Brazil/East",
+                                            "Brazil/West",
+                                            "CET",
+                                            "CST6CDT",
+                                            "Canada/Atlantic",
+                                            "Canada/Central",
+                                            "Canada/East-Saskatchewan",
+                                            "Canada/Eastern",
+                                            "Canada/Mountain",
+                                            "Canada/Newfoundland",
+                                            "Canada/Pacific",
+                                            "Canada/Saskatchewan",
+                                            "Canada/Yukon",
+                                            "Chile/Continental",
+                                            "Chile/EasterIsland",
+                                            "Cuba",
+                                            "EET",
+                                            "EST",
+                                            "EST5EDT",
+                                            "Egypt",
+                                            "Eire",
+                                            "Etc/GMT",
+                                            "Etc/GMT+0",
+                                            "Etc/GMT+1",
+                                            "Etc/GMT+10",
+                                            "Etc/GMT+11",
+                                            "Etc/GMT+12",
+                                            "Etc/GMT+2",
+                                            "Etc/GMT+3",
+                                            "Etc/GMT+4",
+                                            "Etc/GMT+5",
+                                            "Etc/GMT+6",
+                                            "Etc/GMT+7",
+                                            "Etc/GMT+8",
+                                            "Etc/GMT+9",
+                                            "Etc/GMT-0",
+                                            "Etc/GMT-1",
+                                            "Etc/GMT-10",
+                                            "Etc/GMT-11",
+                                            "Etc/GMT-12",
+                                            "Etc/GMT-13",
+                                            "Etc/GMT-14",
+                                            "Etc/GMT-2",
+                                            "Etc/GMT-3",
+                                            "Etc/GMT-4",
+                                            "Etc/GMT-5",
+                                            "Etc/GMT-6",
+                                            "Etc/GMT-7",
+                                            "Etc/GMT-8",
+                                            "Etc/GMT-9",
+                                            "Etc/GMT0",
+                                            "Etc/Greenwich",
+                                            "Etc/UCT",
+                                            "Etc/UTC",
+                                            "Etc/Universal",
+                                            "Etc/Zulu",
+                                            "Europe/Amsterdam",
+                                            "Europe/Andorra",
+                                            "Europe/Athens",
+                                            "Europe/Belfast",
+                                            "Europe/Belgrade",
+                                            "Europe/Berlin",
+                                            "Europe/Bratislava",
+                                            "Europe/Brussels",
+                                            "Europe/Bucharest",
+                                            "Europe/Budapest",
+                                            "Europe/Chisinau",
+                                            "Europe/Copenhagen",
+                                            "Europe/Dublin",
+                                            "Europe/Gibraltar",
+                                            "Europe/Guernsey",
+                                            "Europe/Helsinki",
+                                            "Europe/Isle_of_Man",
+                                            "Europe/Istanbul",
+                                            "Europe/Jersey",
+                                            "Europe/Kaliningrad",
+                                            "Europe/Kiev",
+                                            "Europe/Lisbon",
+                                            "Europe/Ljubljana",
+                                            "Europe/London",
+                                            "Europe/Luxembourg",
+                                            "Europe/Madrid",
+                                            "Europe/Malta",
+                                            "Europe/Mariehamn",
+                                            "Europe/Minsk",
+                                            "Europe/Monaco",
+                                            "Europe/Moscow",
+                                            "Europe/Nicosia",
+                                            "Europe/Oslo",
+                                            "Europe/Paris",
+                                            "Europe/Podgorica",
+                                            "Europe/Prague",
+                                            "Europe/Riga",
+                                            "Europe/Rome",
+                                            "Europe/Samara",
+                                            "Europe/San_Marino",
+                                            "Europe/Sarajevo",
+                                            "Europe/Simferopol",
+                                            "Europe/Skopje",
+                                            "Europe/Sofia",
+                                            "Europe/Stockholm",
+                                            "Europe/Tallinn",
+                                            "Europe/Tirane",
+                                            "Europe/Tiraspol",
+                                            "Europe/Uzhgorod",
+                                            "Europe/Vaduz",
+                                            "Europe/Vatican",
+                                            "Europe/Vienna",
+                                            "Europe/Vilnius",
+                                            "Europe/Volgograd",
+                                            "Europe/Warsaw",
+                                            "Europe/Zagreb",
+                                            "Europe/Zaporozhye",
+                                            "Europe/Zurich",
+                                            "GB",
+                                            "GB-Eire",
+                                            "GMT",
+                                            "GMT+0",
+                                            "GMT-0",
+                                            "GMT0",
+                                            "Greenwich",
+                                            "HST",
+                                            "Hongkong",
+                                            "Iceland",
+                                            "Indian/Antananarivo",
+                                            "Indian/Chagos",
+                                            "Indian/Christmas",
+                                            "Indian/Cocos",
+                                            "Indian/Comoro",
+                                            "Indian/Kerguelen",
+                                            "Indian/Mahe",
+                                            "Indian/Maldives",
+                                            "Indian/Mauritius",
+                                            "Indian/Mayotte",
+                                            "Indian/Reunion",
+                                            "Iran",
+                                            "Israel",
+                                            "Jamaica",
+                                            "Japan",
+                                            "Kwajalein",
+                                            "Libya",
+                                            "MET",
+                                            "MST",
+                                            "MST7MDT",
+                                            "Mexico/BajaNorte",
+                                            "Mexico/BajaSur",
+                                            "Mexico/General",
+                                            "NZ",
+                                            "NZ-CHAT",
+                                            "Navajo",
+                                            "PRC",
+                                            "PST8PDT",
+                                            "Pacific/Apia",
+                                            "Pacific/Auckland",
+                                            "Pacific/Chatham",
+                                            "Pacific/Chuuk",
+                                            "Pacific/Easter",
+                                            "Pacific/Efate",
+                                            "Pacific/Enderbury",
+                                            "Pacific/Fakaofo",
+                                            "Pacific/Fiji",
+                                            "Pacific/Funafuti",
+                                            "Pacific/Galapagos",
+                                            "Pacific/Gambier",
+                                            "Pacific/Guadalcanal",
+                                            "Pacific/Guam",
+                                            "Pacific/Honolulu",
+                                            "Pacific/Johnston",
+                                            "Pacific/Kiritimati",
+                                            "Pacific/Kosrae",
+                                            "Pacific/Kwajalein",
+                                            "Pacific/Majuro",
+                                            "Pacific/Marquesas",
+                                            "Pacific/Midway",
+                                            "Pacific/Nauru",
+                                            "Pacific/Niue",
+                                            "Pacific/Norfolk",
+                                            "Pacific/Noumea",
+                                            "Pacific/Pago_Pago",
+                                            "Pacific/Palau",
+                                            "Pacific/Pitcairn",
+                                            "Pacific/Pohnpei",
+                                            "Pacific/Ponape",
+                                            "Pacific/Port_Moresby",
+                                            "Pacific/Rarotonga",
+                                            "Pacific/Saipan",
+                                            "Pacific/Samoa",
+                                            "Pacific/Tahiti",
+                                            "Pacific/Tarawa",
+                                            "Pacific/Tongatapu",
+                                            "Pacific/Truk",
+                                            "Pacific/Wake",
+                                            "Pacific/Wallis",
+                                            "Pacific/Yap",
+                                            "Poland",
+                                            "Portugal",
+                                            "ROC",
+                                            "ROK",
+                                            "Singapore",
+                                            "Turkey",
+                                            "UCT",
+                                            "US/Alaska",
+                                            "US/Aleutian",
+                                            "US/Arizona",
+                                            "US/Central",
+                                            "US/East-Indiana",
+                                            "US/Eastern",
+                                            "US/Hawaii",
+                                            "US/Indiana-Starke",
+                                            "US/Michigan",
+                                            "US/Mountain",
+                                            "US/Pacific",
+                                            "US/Pacific-New",
+                                            "US/Samoa",
+                                            "UTC",
+                                            "Universal",
+                                            "W-SU",
+                                            "WET",
+                                            "Zulu"};
+
+    static {
+      for (int i = 0; i < timezoneList.length; i++) {
+        timezoneMap.put(timezoneList[i], i);
+      }
+    }
+
+    public static final DateTimeFormatter formatDate        = DateTimeFormat.forPattern("yyyy-MM-dd");
+    public static final DateTimeFormatter formatTimeStamp    = DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss.SSS");
+    public static final DateTimeFormatter formatTimeStampTZ = DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss.SSS ZZZ");
+    public static final DateTimeFormatter formatTime        = DateTimeFormat.forPattern("HH:mm:ss.SSS");
+
+    public static DateTimeFormatter dateTimeTZFormat = null;
+    public static DateTimeFormatter timeFormat = null;
+
+    public static final int yearsToMonths = 12;
+    public static final int hoursToMillis = 60 * 60 * 1000;
+    public static final int minutesToMillis = 60 * 1000;
+    public static final int secondsToMillis = 1000;
+    public static final int monthToStandardDays = 30;
+    public static final long monthsToMillis = 2592000000L; // 30 * 24 * 60 * 60 * 1000
+    public static final int daysToStandardMillis = 24 * 60 * 60 * 1000;
+
+
+    public static int getIndex(String timezone) {
+        return timezoneMap.get(timezone);
+    }
+
+    public static String getTimeZone(int index) {
+        return timezoneList[index];
+    }
+
+    // Returns the lazily built formatter used to parse date strings; the time, millisecond, and zone parts are optional
+    public static DateTimeFormatter getDateTimeFormatter() {
+
+        if (dateTimeTZFormat == null) {
+            DateTimeFormatter dateFormatter = DateTimeFormat.forPattern("yyyy-MM-dd");
+            DateTimeParser optionalTime = DateTimeFormat.forPattern(" HH:mm:ss").getParser();
+            DateTimeParser optionalSec = DateTimeFormat.forPattern(".SSS").getParser();
+            DateTimeParser optionalZone = DateTimeFormat.forPattern(" ZZZ").getParser();
+
+            dateTimeTZFormat = new DateTimeFormatterBuilder().append(dateFormatter).appendOptional(optionalTime).appendOptional(optionalSec).appendOptional(optionalZone).toFormatter();
+        }
+
+        return dateTimeTZFormat;
+    }
+
+    // Returns the lazily built formatter used to parse time strings; the millisecond part is optional
+    public static DateTimeFormatter getTimeFormatter() {
+        if (timeFormat == null) {
+            DateTimeFormatter timeFormatter = DateTimeFormat.forPattern("HH:mm:ss");
+            DateTimeParser optionalSec = DateTimeFormat.forPattern(".SSS").getParser();
+            timeFormat = new DateTimeFormatterBuilder().append(timeFormatter).appendOptional(optionalSec).toFormatter();
+        }
+        return timeFormat;
+    }
+
+    public static int monthsFromPeriod(Period period){
+      return (period.getYears() * yearsToMonths) + period.getMonths();
+    }
+
+    public static int millisFromPeriod(final Period period){
+      return (period.getHours() * hoursToMillis) +
+      (period.getMinutes() * minutesToMillis) +
+      (period.getSeconds() * secondsToMillis) +
+      (period.getMillis());
+    }
+
+}
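The DateUtility helpers above can be restated with only the JDK, as a rough sketch of the name-to-index lookup and the period-to-milliseconds arithmetic. The class name, the three-entry zone array, the -1 result for unknown names, and millisFromParts are assumptions made for illustration. The patch itself uses the full timezoneList and Joda-Time's Period, and how it treats unknown names depends on the timezoneMap declaration, which is not shown in this part of the diff.

```java
import java.util.HashMap;
import java.util.Map;

public class TimezoneIndexSketch {
    // Hypothetical, shortened stand-in for the full timezoneList in the patch.
    static final String[] ZONES = {"Etc/UTC", "Europe/Paris", "US/Pacific"};
    static final Map<String, Integer> INDEX = new HashMap<>();

    static {
        // Same idea as the static initializer in the patch: name -> array position.
        for (int i = 0; i < ZONES.length; i++) {
            INDEX.put(ZONES[i], i);
        }
    }

    // Assumption: unknown names map to -1 instead of failing.
    static int getIndex(String zone) {
        Integer i = INDEX.get(zone);
        return i == null ? -1 : i;
    }

    static String getTimeZone(int index) {
        return ZONES[index];
    }

    // Same arithmetic as millisFromPeriod, applied to plain ints.
    static int millisFromParts(int hours, int minutes, int seconds, int millis) {
        return hours * 60 * 60 * 1000
            + minutes * 60 * 1000
            + seconds * 1000
            + millis;
    }

    public static void main(String[] args) {
        int idx = getIndex("Europe/Paris");
        System.out.println(idx + " -> " + getTimeZone(idx));
        System.out.println(millisFromParts(1, 2, 3, 4));
    }
}
```

Storing the index rather than the name keeps the per-value cost of a timezone to a single int, at the price of freezing the order of the list.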


[08/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/AbstractFieldWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/AbstractFieldWriter.java b/java/vector/src/main/codegen/templates/AbstractFieldWriter.java
new file mode 100644
index 0000000..6ee9dad
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/AbstractFieldWriter.java
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/AbstractFieldWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using freemarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+abstract class AbstractFieldWriter extends AbstractBaseWriter implements FieldWriter {
+  AbstractFieldWriter(FieldWriter parent) {
+    super(parent);
+  }
+
+  @Override
+  public void start() {
+    throw new IllegalStateException(String.format("You tried to start when you are using a ValueWriter of type %s.", this.getClass().getSimpleName()));
+  }
+
+  @Override
+  public void end() {
+    throw new IllegalStateException(String.format("You tried to end when you are using a ValueWriter of type %s.", this.getClass().getSimpleName()));
+  }
+
+  @Override
+  public void startList() {
+    throw new IllegalStateException(String.format("You tried to start a list when you are using a ValueWriter of type %s.", this.getClass().getSimpleName()));
+  }
+
+  @Override
+  public void endList() {
+    throw new IllegalStateException(String.format("You tried to end a list when you are using a ValueWriter of type %s.", this.getClass().getSimpleName()));
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  @Override
+  public void write(${name}Holder holder) {
+    fail("${name}");
+  }
+
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    fail("${name}");
+  }
+
+  </#list></#list>
+
+  public void writeNull() {
+    fail("Null");
+  }
+
+  /**
+   * This implementation returns {@code false}.
+   * <p>
+   *   Must be overridden by map writers.
+   * </p>
+   */
+  @Override
+  public boolean isEmptyMap() {
+    return false;
+  }
+
+  @Override
+  public MapWriter map() {
+    fail("Map");
+    return null;
+  }
+
+  @Override
+  public ListWriter list() {
+    fail("List");
+    return null;
+  }
+
+  @Override
+  public MapWriter map(String name) {
+    fail("Map");
+    return null;
+  }
+
+  @Override
+  public ListWriter list(String name) {
+    fail("List");
+    return null;
+  }
+
+  <#list vv.types as type><#list type.minor as minor>
+  <#assign lowerName = minor.class?uncap_first />
+  <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+  <#assign upperName = minor.class?upper_case />
+  <#assign capName = minor.class?cap_first />
+  <#if minor.class?starts_with("Decimal") >
+  public ${capName}Writer ${lowerName}(String name, int scale, int precision) {
+    fail("${capName}");
+    return null;
+  }
+  </#if>
+
+  @Override
+  public ${capName}Writer ${lowerName}(String name) {
+    fail("${capName}");
+    return null;
+  }
+
+  @Override
+  public ${capName}Writer ${lowerName}() {
+    fail("${capName}");
+    return null;
+  }
+
+  </#list></#list>
+
+  public void copyReader(FieldReader reader) {
+    fail("Copy FieldReader");
+  }
+
+  public void copyReaderToField(String name, FieldReader reader) {
+    fail("Copy FieldReader to String");
+  }
+
+  private void fail(String name) {
+    throw new IllegalArgumentException(String.format("You tried to write a %s type when you are using a ValueWriter of type %s.", name, this.getClass().getSimpleName()));
+  }
+}
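The template above gives every write method a default body that throws, so a concrete writer only overrides the methods for the type it supports. A minimal sketch of that pattern follows. The interface, the two method names, and IntOnlyWriter are hypothetical and much smaller than the generated FieldWriter API; only the wording of the error message is taken from the template.

```java
public class FailByDefaultSketch {
    // Hypothetical, cut-down writer interface (the generated one has one method per minor type).
    interface ValueWriter {
        void writeInt(int value);
        void writeVarChar(String value);
    }

    // Base class: every write method rejects the call unless a subclass overrides it.
    static abstract class AbstractValueWriter implements ValueWriter {
        @Override
        public void writeInt(int value) {
            fail("Int");
        }

        @Override
        public void writeVarChar(String value) {
            fail("VarChar");
        }

        private void fail(String name) {
            throw new IllegalArgumentException(String.format(
                "You tried to write a %s type when you are using a ValueWriter of type %s.",
                name, getClass().getSimpleName()));
        }
    }

    // Concrete writer: supports ints only, inherits the failing default for everything else.
    static final class IntOnlyWriter extends AbstractValueWriter {
        int last;

        @Override
        public void writeInt(int value) {
            last = value;
        }
    }

    public static void main(String[] args) {
        IntOnlyWriter writer = new IntOnlyWriter();
        writer.writeInt(7);
        try {
            writer.writeVarChar("oops");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The benefit is that a type mismatch surfaces as an exception naming both the attempted type and the writer class, rather than as silently dropped data.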

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java b/java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
new file mode 100644
index 0000000..549dbf1
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
@@ -0,0 +1,142 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/AbstractPromotableFieldWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * A FieldWriter which delegates calls to another FieldWriter. The delegate FieldWriter can be promoted to a new type
+ * when necessary. Classes that extend this class are responsible for handling promotion.
+ *
+ * This class is generated using freemarker and the ${.template_name} template.
+ *
+ */
+@SuppressWarnings("unused")
+abstract class AbstractPromotableFieldWriter extends AbstractFieldWriter {
+  AbstractPromotableFieldWriter(FieldWriter parent) {
+    super(parent);
+  }
+
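+  /**
+   * Retrieve the FieldWriter, promoting it if it is not a FieldWriter of the specified type.
+   * @param type the minor type the returned writer must support
+   * @return the FieldWriter for the given type, promoted if necessary
+   */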
+  abstract protected FieldWriter getWriter(MinorType type);
+
+  /**
+   * Return the current FieldWriter.
+   * @return the current delegate FieldWriter
+   */
+  abstract protected FieldWriter getWriter();
+
+  @Override
+  public void start() {
+    getWriter(MinorType.MAP).start();
+  }
+
+  @Override
+  public void end() {
+    getWriter(MinorType.MAP).end();
+  }
+
+  @Override
+  public void startList() {
+    getWriter(MinorType.LIST).startList();
+  }
+
+  @Override
+  public void endList() {
+    getWriter(MinorType.LIST).endList();
+  }
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#if !minor.class?starts_with("Decimal") >
+  @Override
+  public void write(${name}Holder holder) {
+    getWriter(MinorType.${name?upper_case}).write(holder);
+  }
+
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    getWriter(MinorType.${name?upper_case}).write${minor.class}(<#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+  }
+
+  </#if>
+  </#list></#list>
+
+  public void writeNull() {
+  }
+
+  @Override
+  public MapWriter map() {
+    return getWriter(MinorType.LIST).map();
+  }
+
+  @Override
+  public ListWriter list() {
+    return getWriter(MinorType.LIST).list();
+  }
+
+  @Override
+  public MapWriter map(String name) {
+    return getWriter(MinorType.MAP).map(name);
+  }
+
+  @Override
+  public ListWriter list(String name) {
+    return getWriter(MinorType.MAP).list(name);
+  }
+
+  <#list vv.types as type><#list type.minor as minor>
+  <#assign lowerName = minor.class?uncap_first />
+  <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+  <#assign upperName = minor.class?upper_case />
+  <#assign capName = minor.class?cap_first />
+  <#if !minor.class?starts_with("Decimal") >
+
+  @Override
+  public ${capName}Writer ${lowerName}(String name) {
+    return getWriter(MinorType.MAP).${lowerName}(name);
+  }
+
+  @Override
+  public ${capName}Writer ${lowerName}() {
+    return getWriter(MinorType.LIST).${lowerName}();
+  }
+
+  </#if>
+  </#list></#list>
+
+  public void copyReader(FieldReader reader) {
+    getWriter().copyReader(reader);
+  }
+
+  public void copyReaderToField(String name, FieldReader reader) {
+    getWriter().copyReaderToField(name, reader);
+  }
+}
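The delegation in the template above can be illustrated with a small stand-alone sketch: every write call goes through getWriter(type), which may swap the delegate for a wider one before forwarding. The template leaves the promotion policy to subclasses, so the policy here (promote to a UNION writer on the first type mismatch and carry existing values over) is purely an assumption, as are all class, enum, and method names.

```java
import java.util.ArrayList;
import java.util.List;

public class PromotableWriterSketch {
    // Hypothetical stand-in for MinorType.
    enum Kind { INT, VARCHAR, UNION }

    // Hypothetical delegate: remembers its kind and the values written to it.
    static class TypedWriter {
        final Kind kind;
        final List<Object> values = new ArrayList<>();

        TypedWriter(Kind kind) {
            this.kind = kind;
        }

        void write(Object value) {
            values.add(value);
        }
    }

    static class PromotableWriter {
        private TypedWriter delegate;

        // Assumed policy: the first request fixes the type; a later request for a
        // different type promotes the delegate to UNION and keeps earlier values.
        TypedWriter getWriter(Kind requested) {
            if (delegate == null) {
                delegate = new TypedWriter(requested);
            } else if (delegate.kind != requested && delegate.kind != Kind.UNION) {
                TypedWriter promoted = new TypedWriter(Kind.UNION);
                promoted.values.addAll(delegate.values);
                delegate = promoted;
            }
            return delegate;
        }

        // Current delegate, without promotion.
        TypedWriter getWriter() {
            return delegate;
        }

        void writeInt(int value) {
            getWriter(Kind.INT).write(value);
        }

        void writeVarChar(String value) {
            getWriter(Kind.VARCHAR).write(value);
        }
    }

    public static void main(String[] args) {
        PromotableWriter writer = new PromotableWriter();
        writer.writeInt(1);
        System.out.println(writer.getWriter().kind);
        writer.writeVarChar("a");
        System.out.println(writer.getWriter().kind + " " + writer.getWriter().values);
    }
}
```

Callers never hold the delegate directly, which is what lets the promotable writer replace it without invalidating references.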

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/BaseReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/BaseReader.java b/java/vector/src/main/codegen/templates/BaseReader.java
new file mode 100644
index 0000000..8f12b1d
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/BaseReader.java
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/reader/BaseReader.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.reader;
+
+<#include "/@includes/vv_imports.ftl" />
+
+
+
+@SuppressWarnings("unused")
+public interface BaseReader extends Positionable{
+  MajorType getType();
+  MaterializedField getField();
+  void reset();
+  void read(UnionHolder holder);
+  void read(int index, UnionHolder holder);
+  void copyAsValue(UnionWriter writer);
+  boolean isSet();
+
+  public interface MapReader extends BaseReader, Iterable<String>{
+    FieldReader reader(String name);
+  }
+  
+  public interface RepeatedMapReader extends MapReader{
+    boolean next();
+    int size();
+    void copyAsValue(MapWriter writer);
+  }
+  
+  public interface ListReader extends BaseReader{
+    FieldReader reader(); 
+  }
+  
+  public interface RepeatedListReader extends ListReader{
+    boolean next();
+    int size();
+    void copyAsValue(ListWriter writer);
+  }
+  
+  public interface ScalarReader extends  
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first /> ${name}Reader, </#list></#list> 
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first /> Repeated${name}Reader, </#list></#list>
+  BaseReader {}
+  
+  interface ComplexReader{
+    MapReader rootAsMap();
+    ListReader rootAsList();
+    boolean rootIsMap();
+    boolean ok();
+  }
+}
+

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/BaseWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/BaseWriter.java b/java/vector/src/main/codegen/templates/BaseWriter.java
new file mode 100644
index 0000000..299b238
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/BaseWriter.java
@@ -0,0 +1,117 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/writer/BaseWriter.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.writer;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * File generated from ${.template_name} using FreeMarker.
+ */
+@SuppressWarnings("unused")
+public interface BaseWriter extends AutoCloseable, Positionable {
+  FieldWriter getParent();
+  int getValueCapacity();
+
+  public interface MapWriter extends BaseWriter {
+
+    MaterializedField getField();
+
+    /**
+     * Whether this writer is a map writer and is empty (has no children).
+     * 
+     * <p>
+     *   Intended only for use in determining whether to add a dummy vector to
+     *   avoid an empty (zero-column) schema, as in JsonReader.
+     * </p>
+     * 
+     */
+    boolean isEmptyMap();
+
+    <#list vv.types as type><#list type.minor as minor>
+    <#assign lowerName = minor.class?uncap_first />
+    <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+    <#assign upperName = minor.class?upper_case />
+    <#assign capName = minor.class?cap_first />
+    <#if minor.class?starts_with("Decimal") >
+    ${capName}Writer ${lowerName}(String name, int scale, int precision);
+    </#if>
+    ${capName}Writer ${lowerName}(String name);
+    </#list></#list>
+
+    void copyReaderToField(String name, FieldReader reader);
+    MapWriter map(String name);
+    ListWriter list(String name);
+    void start();
+    void end();
+  }
+
+  public interface ListWriter extends BaseWriter {
+    void startList();
+    void endList();
+    MapWriter map();
+    ListWriter list();
+    void copyReader(FieldReader reader);
+
+    <#list vv.types as type><#list type.minor as minor>
+    <#assign lowerName = minor.class?uncap_first />
+    <#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+    <#assign upperName = minor.class?upper_case />
+    <#assign capName = minor.class?cap_first />
+    ${capName}Writer ${lowerName}();
+    </#list></#list>
+  }
+
+  public interface ScalarWriter extends
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first /> ${name}Writer, </#list></#list> BaseWriter {}
+
+  public interface ComplexWriter {
+    void allocate();
+    void clear();
+    void copyReader(FieldReader reader);
+    MapWriter rootAsMap();
+    ListWriter rootAsList();
+
+    void setPosition(int index);
+    void setValueCount(int count);
+    void reset();
+  }
+
+  public interface MapOrListWriter {
+    void start();
+    void end();
+    MapOrListWriter map(String name);
+    MapOrListWriter listoftmap(String name);
+    MapOrListWriter list(String name);
+    boolean isMapWriter();
+    boolean isListWriter();
+    VarCharWriter varChar(String name);
+    IntWriter integer(String name);
+    BigIntWriter bigInt(String name);
+    Float4Writer float4(String name);
+    Float8Writer float8(String name);
+    BitWriter bit(String name);
+    VarBinaryWriter binary(String name);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/BasicTypeHelper.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/BasicTypeHelper.java b/java/vector/src/main/codegen/templates/BasicTypeHelper.java
new file mode 100644
index 0000000..bb6446e
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/BasicTypeHelper.java
@@ -0,0 +1,538 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/util/BasicTypeHelper.java" />
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.util;
+
+<#include "/@includes/vv_imports.ftl" />
+import org.apache.arrow.vector.complex.UnionVector;
+import org.apache.arrow.vector.complex.RepeatedMapVector;
+import org.apache.arrow.vector.util.CallBack;
+
+public class BasicTypeHelper {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BasicTypeHelper.class);
+
+  private static final int WIDTH_ESTIMATE = 50;
+
+  // Default length when casting to varchar: 65536 = 2^16.
+  // This only defines an absolute maximum for values; setting
+  // a high value like this will not inflate the size of small values.
+  public static final int VARCHAR_DEFAULT_CAST_LEN = 65536;
+
+  protected static String buildErrorMessage(final String operation, final MinorType type, final DataMode mode) {
+    return String.format("Unable to %s for minor type [%s] and mode [%s]", operation, type, mode);
+  }
+
+  protected static String buildErrorMessage(final String operation, final MajorType type) {
+    return buildErrorMessage(operation, type.getMinorType(), type.getMode());
+  }
+
+  public static int getSize(MajorType major) {
+    switch (major.getMinorType()) {
+<#list vv.types as type>
+  <#list type.minor as minor>
+    case ${minor.class?upper_case}:
+      return ${type.width}<#if minor.class?substring(0, 3) == "Var" ||
+                               minor.class?substring(0, 3) == "PRO" ||
+                               minor.class?substring(0, 3) == "MSG"> + WIDTH_ESTIMATE</#if>;
+  </#list>
+</#list>
+//      case FIXEDCHAR: return major.getWidth();
+//      case FIXED16CHAR: return major.getWidth();
+//      case FIXEDBINARY: return major.getWidth();
+    }
+    throw new UnsupportedOperationException(buildErrorMessage("get size", major));
+  }
+
+  public static ValueVector getNewVector(String name, BufferAllocator allocator, MajorType type, CallBack callback){
+    MaterializedField field = MaterializedField.create(name, type);
+    return getNewVector(field, allocator, callback);
+  }
+  
+  
+  public static Class<?> getValueVectorClass(MinorType type, DataMode mode){
+    switch (type) {
+    case UNION:
+      return UnionVector.class;
+    case MAP:
+      switch (mode) {
+      case OPTIONAL:
+      case REQUIRED:
+        return MapVector.class;
+      case REPEATED:
+        return RepeatedMapVector.class;
+      }
+      
+    case LIST:
+      switch (mode) {
+      case REPEATED:
+        return RepeatedListVector.class;
+      case REQUIRED:
+      case OPTIONAL:
+        return ListVector.class;
+      }
+    
+<#list vv.types as type>
+  <#list type.minor as minor>
+      case ${minor.class?upper_case}:
+        switch (mode) {
+          case REQUIRED:
+            return ${minor.class}Vector.class;
+          case OPTIONAL:
+            return Nullable${minor.class}Vector.class;
+          case REPEATED:
+            return Repeated${minor.class}Vector.class;
+        }
+  </#list>
+</#list>
+    case GENERIC_OBJECT:
+      return ObjectVector.class;
+    default:
+      break;
+    }
+    throw new UnsupportedOperationException(buildErrorMessage("get value vector class", type, mode));
+  }
+  public static Class<?> getReaderClassName( MinorType type, DataMode mode, boolean isSingularRepeated){
+    switch (type) {
+    case MAP:
+      switch (mode) {
+      case REQUIRED:
+        if (!isSingularRepeated)
+          return SingleMapReaderImpl.class;
+        else
+          return SingleLikeRepeatedMapReaderImpl.class;
+      case REPEATED: 
+          return RepeatedMapReaderImpl.class;
+      }
+    case LIST:
+      switch (mode) {
+      case REQUIRED:
+        return SingleListReaderImpl.class;
+      case REPEATED:
+        return RepeatedListReaderImpl.class;
+      }
+      
+<#list vv.types as type>
+  <#list type.minor as minor>
+      case ${minor.class?upper_case}:
+        switch (mode) {
+          case REQUIRED:
+            return ${minor.class}ReaderImpl.class;
+          case OPTIONAL:
+            return Nullable${minor.class}ReaderImpl.class;
+          case REPEATED:
+            return Repeated${minor.class}ReaderImpl.class;
+        }
+  </#list>
+</#list>
+      default:
+        break;
+      }
+      throw new UnsupportedOperationException(buildErrorMessage("get reader class name", type, mode));
+  }
+  
+  public static Class<?> getWriterInterface( MinorType type, DataMode mode){
+    switch (type) {
+    case UNION: return UnionWriter.class;
+    case MAP: return MapWriter.class;
+    case LIST: return ListWriter.class;
+<#list vv.types as type>
+  <#list type.minor as minor>
+      case ${minor.class?upper_case}: return ${minor.class}Writer.class;
+  </#list>
+</#list>
+      default:
+        break;
+      }
+      throw new UnsupportedOperationException(buildErrorMessage("get writer interface", type, mode));
+  }
+  
+  public static Class<?> getWriterImpl( MinorType type, DataMode mode){
+    switch (type) {
+    case UNION:
+      return UnionWriter.class;
+    case MAP:
+      switch (mode) {
+      case REQUIRED:
+      case OPTIONAL:
+        return SingleMapWriter.class;
+      case REPEATED:
+        return RepeatedMapWriter.class;
+      }
+    case LIST:
+      switch (mode) {
+      case REQUIRED:
+      case OPTIONAL:
+        return UnionListWriter.class;
+      case REPEATED:
+        return RepeatedListWriter.class;
+      }
+      
+<#list vv.types as type>
+  <#list type.minor as minor>
+      case ${minor.class?upper_case}:
+        switch (mode) {
+          case REQUIRED:
+            return ${minor.class}WriterImpl.class;
+          case OPTIONAL:
+            return Nullable${minor.class}WriterImpl.class;
+          case REPEATED:
+            return Repeated${minor.class}WriterImpl.class;
+        }
+  </#list>
+</#list>
+      default:
+        break;
+      }
+      throw new UnsupportedOperationException(buildErrorMessage("get writer implementation", type, mode));
+  }
+
+  public static Class<?> getHolderReaderImpl( MinorType type, DataMode mode){
+    switch (type) {      
+<#list vv.types as type>
+  <#list type.minor as minor>
+      case ${minor.class?upper_case}:
+        switch (mode) {
+          case REQUIRED:
+            return ${minor.class}HolderReaderImpl.class;
+          case OPTIONAL:
+            return Nullable${minor.class}HolderReaderImpl.class;
+          case REPEATED:
+            return Repeated${minor.class}HolderReaderImpl.class;
+        }
+  </#list>
+</#list>
+      default:
+        break;
+      }
+      throw new UnsupportedOperationException(buildErrorMessage("get holder reader implementation", type, mode));
+  }
+  
+  public static ValueVector getNewVector(MaterializedField field, BufferAllocator allocator){
+    return getNewVector(field, allocator, null);
+  }
+  public static ValueVector getNewVector(MaterializedField field, BufferAllocator allocator, CallBack callBack){
+    MajorType type = field.getType();
+
+    switch (type.getMinorType()) {
+    
+    case UNION:
+      return new UnionVector(field, allocator, callBack);
+
+    case MAP:
+      switch (type.getMode()) {
+      case REQUIRED:
+      case OPTIONAL:
+        return new MapVector(field, allocator, callBack);
+      case REPEATED:
+        return new RepeatedMapVector(field, allocator, callBack);
+      }
+    case LIST:
+      switch (type.getMode()) {
+      case REPEATED:
+        return new RepeatedListVector(field, allocator, callBack);
+      case OPTIONAL:
+      case REQUIRED:
+        return new ListVector(field, allocator, callBack);
+      }
+<#list vv.types as type>
+  <#list type.minor as minor>
+    case ${minor.class?upper_case}:
+      switch (type.getMode()) {
+        case REQUIRED:
+          return new ${minor.class}Vector(field, allocator);
+        case OPTIONAL:
+          return new Nullable${minor.class}Vector(field, allocator);
+        case REPEATED:
+          return new Repeated${minor.class}Vector(field, allocator);
+      }
+  </#list>
+</#list>
+    case GENERIC_OBJECT:
+      return new ObjectVector(field, allocator);
+    default:
+      break;
+    }
+    // All ValueVector types have been handled.
+    throw new UnsupportedOperationException(buildErrorMessage("get new vector", type));
+  }
+
+  public static ValueHolder getValue(ValueVector vector, int index) {
+    MajorType type = vector.getField().getType();
+    ValueHolder holder;
+    switch(type.getMinorType()) {
+<#list vv.types as type>
+  <#list type.minor as minor>
+    case ${minor.class?upper_case} :
+      <#if minor.class?starts_with("Var") || minor.class == "IntervalDay" || minor.class == "Interval" ||
+        minor.class?starts_with("Decimal28") ||  minor.class?starts_with("Decimal38")>
+        switch (type.getMode()) {
+          case REQUIRED:
+            holder = new ${minor.class}Holder();
+            ((${minor.class}Vector) vector).getAccessor().get(index, (${minor.class}Holder)holder);
+            return holder;
+          case OPTIONAL:
+            holder = new Nullable${minor.class}Holder();
+            ((Nullable${minor.class}Holder)holder).isSet = ((Nullable${minor.class}Vector) vector).getAccessor().isSet(index);
+            if (((Nullable${minor.class}Holder)holder).isSet == 1) {
+              ((Nullable${minor.class}Vector) vector).getAccessor().get(index, (Nullable${minor.class}Holder)holder);
+            }
+            return holder;
+        }
+      <#else>
+      switch (type.getMode()) {
+        case REQUIRED:
+          holder = new ${minor.class}Holder();
+          ((${minor.class}Holder)holder).value = ((${minor.class}Vector) vector).getAccessor().get(index);
+          return holder;
+        case OPTIONAL:
+          holder = new Nullable${minor.class}Holder();
+          ((Nullable${minor.class}Holder)holder).isSet = ((Nullable${minor.class}Vector) vector).getAccessor().isSet(index);
+          if (((Nullable${minor.class}Holder)holder).isSet == 1) {
+            ((Nullable${minor.class}Holder)holder).value = ((Nullable${minor.class}Vector) vector).getAccessor().get(index);
+          }
+          return holder;
+      }
+      </#if>
+  </#list>
+</#list>
+    case GENERIC_OBJECT:
+      holder = new ObjectHolder();
+      ((ObjectHolder)holder).obj = ((ObjectVector) vector).getAccessor().getObject(index);
+      return holder;
+    }
+
+    throw new UnsupportedOperationException(buildErrorMessage("get value", type));
+  }
+
+  public static void setValue(ValueVector vector, int index, ValueHolder holder) {
+    MajorType type = vector.getField().getType();
+
+    switch(type.getMinorType()) {
+<#list vv.types as type>
+  <#list type.minor as minor>
+    case ${minor.class?upper_case} :
+      switch (type.getMode()) {
+        case REQUIRED:
+          ((${minor.class}Vector) vector).getMutator().setSafe(index, (${minor.class}Holder) holder);
+          return;
+        case OPTIONAL:
+          if (((Nullable${minor.class}Holder) holder).isSet == 1) {
+            ((Nullable${minor.class}Vector) vector).getMutator().setSafe(index, (Nullable${minor.class}Holder) holder);
+          }
+          return;
+      }
+  </#list>
+</#list>
+    case GENERIC_OBJECT:
+      ((ObjectVector) vector).getMutator().setSafe(index, (ObjectHolder) holder);
+      return;
+    default:
+      throw new UnsupportedOperationException(buildErrorMessage("set value", type));
+    }
+  }
+
+  public static void setValueSafe(ValueVector vector, int index, ValueHolder holder) {
+    MajorType type = vector.getField().getType();
+
+    switch(type.getMinorType()) {
+      <#list vv.types as type>
+      <#list type.minor as minor>
+      case ${minor.class?upper_case} :
+      switch (type.getMode()) {
+        case REQUIRED:
+          ((${minor.class}Vector) vector).getMutator().setSafe(index, (${minor.class}Holder) holder);
+          return;
+        case OPTIONAL:
+          if (((Nullable${minor.class}Holder) holder).isSet == 1) {
+            ((Nullable${minor.class}Vector) vector).getMutator().setSafe(index, (Nullable${minor.class}Holder) holder);
+          } else {
+            ((Nullable${minor.class}Vector) vector).getMutator().isSafe(index);
+          }
+          return;
+      }
+      </#list>
+      </#list>
+      case GENERIC_OBJECT:
+        ((ObjectVector) vector).getMutator().setSafe(index, (ObjectHolder) holder);
+        return;
+      default:
+        throw new UnsupportedOperationException(buildErrorMessage("set value safe", type));
+    }
+  }
+
+  public static boolean compareValues(ValueVector v1, int v1index, ValueVector v2, int v2index) {
+    MajorType type1 = v1.getField().getType();
+    MajorType type2 = v2.getField().getType();
+
+    if (type1.getMinorType() != type2.getMinorType()) {
+      return false;
+    }
+
+    switch(type1.getMinorType()) {
+<#list vv.types as type>
+  <#list type.minor as minor>
+    case ${minor.class?upper_case} :
+      if ( ((${minor.class}Vector) v1).getAccessor().get(v1index) == 
+           ((${minor.class}Vector) v2).getAccessor().get(v2index) ) 
+        return true;
+      break;
+  </#list>
+</#list>
+    default:
+      break;
+    }
+    return false;
+  }
+
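+  /**
+   * Create a ValueHolder for the given MajorType.
+   * @param type the major type (minor type and data mode) of the holder to create
+   * @return a new holder matching the given type
+   */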
+  public static ValueHolder createValueHolder(MajorType type) {
+    switch(type.getMinorType()) {
+      <#list vv.types as type>
+      <#list type.minor as minor>
+      case ${minor.class?upper_case} :
+
+        switch (type.getMode()) {
+          case REQUIRED:
+            return new ${minor.class}Holder();
+          case OPTIONAL:
+            return new Nullable${minor.class}Holder();
+          case REPEATED:
+            return new Repeated${minor.class}Holder();
+        }
+      </#list>
+      </#list>
+      case GENERIC_OBJECT:
+        return new ObjectHolder();
+      default:
+        throw new UnsupportedOperationException(buildErrorMessage("create value holder", type));
+    }
+  }
+
+  public static boolean isNull(ValueHolder holder) {
+    MajorType type = getValueHolderType(holder);
+
+    switch(type.getMinorType()) {
+      <#list vv.types as type>
+      <#list type.minor as minor>
+      case ${minor.class?upper_case} :
+
+      switch (type.getMode()) {
+        case REQUIRED:
+          return true;
+        case OPTIONAL:
+          return ((Nullable${minor.class}Holder) holder).isSet == 0;
+        case REPEATED:
+          return true;
+      }
+      </#list>
+      </#list>
+      default:
+        throw new UnsupportedOperationException(buildErrorMessage("check is null", type));
+    }
+  }
+
+  public static ValueHolder deNullify(ValueHolder holder) {
+    MajorType type = getValueHolderType(holder);
+
+    switch(type.getMinorType()) {
+      <#list vv.types as type>
+      <#list type.minor as minor>
+      case ${minor.class?upper_case} :
+
+        switch (type.getMode()) {
+          case REQUIRED:
+            return holder;
+          case OPTIONAL:
+            if( ((Nullable${minor.class}Holder) holder).isSet == 1) {
+              ${minor.class}Holder newHolder = new ${minor.class}Holder();
+
+              <#assign fields = minor.fields!type.fields />
+              <#list fields as field>
+              newHolder.${field.name} = ((Nullable${minor.class}Holder) holder).${field.name};
+              </#list>
+
+              return newHolder;
+            } else {
+              throw new UnsupportedOperationException("You can not convert a null value into a non-null value!");
+            }
+          case REPEATED:
+            return holder;
+        }
+      </#list>
+      </#list>
+      default:
+        throw new UnsupportedOperationException(buildErrorMessage("deNullify", type));
+    }
+  }
+
+  public static ValueHolder nullify(ValueHolder holder) {
+    MajorType type = getValueHolderType(holder);
+
+    switch(type.getMinorType()) {
+      <#list vv.types as type>
+      <#list type.minor as minor>
+      case ${minor.class?upper_case} :
+        switch (type.getMode()) {
+          case REQUIRED:
+            Nullable${minor.class}Holder newHolder = new Nullable${minor.class}Holder();
+            newHolder.isSet = 1;
+            <#assign fields = minor.fields!type.fields />
+            <#list fields as field>
+            newHolder.${field.name} = ((${minor.class}Holder) holder).${field.name};
+            </#list>
+            return newHolder;
+          case OPTIONAL:
+            return holder;
+          case REPEATED:
+            throw new UnsupportedOperationException("You can not convert repeated type " + type + " to nullable type!");
+        }
+      </#list>
+      </#list>
+      default:
+        throw new UnsupportedOperationException(buildErrorMessage("nullify", type));
+    }
+  }
+
+  public static MajorType getValueHolderType(ValueHolder holder) {
+
+    if (0 == 1) {
+      return null;
+    }
+    <#list vv.types as type>
+    <#list type.minor as minor>
+      else if (holder instanceof ${minor.class}Holder) {
+        return ((${minor.class}Holder) holder).TYPE;
+      } else if (holder instanceof Nullable${minor.class}Holder) {
+        return ((Nullable${minor.class}Holder) holder).TYPE;
+      }
+    </#list>
+    </#list>
+
+    throw new UnsupportedOperationException("ValueHolder is not supported for 'getValueHolderType' method.");
+
+  }
+
+}
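
The nullify/deNullify helpers in BasicTypeHelper above convert between the REQUIRED and OPTIONAL holder variants of a type. The following is a minimal, self-contained sketch of that conversion for a single int type. The class HolderSketch and its nested IntHolder and NullableIntHolder are hypothetical stand-ins for the generated holder classes, reduced to the fields needed here; it is an illustration of the pattern, not the generated code.

```java
// Hypothetical stand-ins for the generated holder classes (assumption:
// the real holders carry more fields, such as TYPE; only value/isSet matter here).
final class HolderSketch {
  interface ValueHolder {}

  static final class IntHolder implements ValueHolder {
    int value;
  }

  static final class NullableIntHolder implements ValueHolder {
    int isSet; // 1 = value present, 0 = null
    int value;
  }

  // REQUIRED -> OPTIONAL: copy the fields and mark the value as set.
  static NullableIntHolder nullify(IntHolder holder) {
    NullableIntHolder out = new NullableIntHolder();
    out.isSet = 1;
    out.value = holder.value;
    return out;
  }

  // OPTIONAL -> REQUIRED: only legal when a value is actually present.
  static IntHolder deNullify(NullableIntHolder holder) {
    if (holder.isSet != 1) {
      throw new UnsupportedOperationException(
          "You can not convert a null value into a non-null value!");
    }
    IntHolder out = new IntHolder();
    out.value = holder.value;
    return out;
  }

  public static void main(String[] args) {
    IntHolder required = new IntHolder();
    required.value = 42;
    NullableIntHolder optional = nullify(required);
    System.out.println("isSet=" + optional.isSet + " value=" + deNullify(optional).value);
  }
}
```

The asymmetry is deliberate in this sketch: widening to the nullable form always succeeds, while narrowing has to reject an unset holder because a REQUIRED holder has no way to represent null.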

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/ComplexCopier.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/ComplexCopier.java b/java/vector/src/main/codegen/templates/ComplexCopier.java
new file mode 100644
index 0000000..3614231
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/ComplexCopier.java
@@ -0,0 +1,133 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/ComplexCopier.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using freemarker and the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+public class ComplexCopier {
+
+  /**
+   * Do a deep copy of the value in input into output.
+   * @param input reader to copy the value from
+   * @param output writer to copy the value into
+   */
+  public static void copy(FieldReader input, FieldWriter output) {
+    writeValue(input, output);
+  }
+
+  private static void writeValue(FieldReader reader, FieldWriter writer) {
+    final DataMode m = reader.getType().getMode();
+    final MinorType mt = reader.getType().getMinorType();
+
+    switch(m){
+    case OPTIONAL:
+    case REQUIRED:
+
+
+      switch (mt) {
+
+      case LIST:
+        writer.startList();
+        while (reader.next()) {
+          writeValue(reader.reader(), getListWriterForReader(reader.reader(), writer));
+        }
+        writer.endList();
+        break;
+      case MAP:
+        writer.start();
+        if (reader.isSet()) {
+          for(String name : reader){
+            FieldReader childReader = reader.reader(name);
+            if(childReader.isSet()){
+              writeValue(childReader, getMapWriterForReader(childReader, writer, name));
+            }
+          }
+        }
+        writer.end();
+        break;
+  <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+  <#if !minor.class?starts_with("Decimal")>
+
+      case ${name?upper_case}:
+        if (reader.isSet()) {
+          Nullable${name}Holder ${uncappedName}Holder = new Nullable${name}Holder();
+          reader.read(${uncappedName}Holder);
+          if (${uncappedName}Holder.isSet == 1) {
+            writer.write${name}(<#list fields as field>${uncappedName}Holder.${field.name}<#if field_has_next>, </#if></#list>);
+          }
+        }
+        break;
+
+  </#if>
+  </#list></#list>
+      }
+      break;
+    }
+  }
+
+  private static FieldWriter getMapWriterForReader(FieldReader reader, MapWriter writer, String name) {
+    switch (reader.getType().getMinorType()) {
+    <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+    <#assign fields = minor.fields!type.fields />
+    <#assign uncappedName = name?uncap_first/>
+    <#if !minor.class?starts_with("Decimal")>
+    case ${name?upper_case}:
+      return (FieldWriter) writer.<#if name == "Int">integer<#else>${uncappedName}</#if>(name);
+    </#if>
+    </#list></#list>
+    case MAP:
+      return (FieldWriter) writer.map(name);
+    case LIST:
+      return (FieldWriter) writer.list(name);
+    default:
+      throw new UnsupportedOperationException(reader.getType().toString());
+    }
+  }
+
+  private static FieldWriter getListWriterForReader(FieldReader reader, ListWriter writer) {
+    switch (reader.getType().getMinorType()) {
+    <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
+    <#assign fields = minor.fields!type.fields />
+    <#assign uncappedName = name?uncap_first/>
+    <#if !minor.class?starts_with("Decimal")>
+    case ${name?upper_case}:
+      return (FieldWriter) writer.<#if name == "Int">integer<#else>${uncappedName}</#if>();
+    </#if>
+    </#list></#list>
+    case MAP:
+      return (FieldWriter) writer.map();
+    case LIST:
+      return (FieldWriter) writer.list();
+    default:
+      throw new UnsupportedOperationException(reader.getType().toString());
+    }
+  }
+}
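
ComplexCopier.writeValue above recurses over the value's type: lists copy each element, maps copy each child that is set, and scalars are written directly. The sketch below shows the same recursion shape on plain java.util collections instead of FieldReader/FieldWriter. The class DeepCopySketch is hypothetical, and treating a null map entry as "not set" is an assumption made for this illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a type-dispatched recursive deep copy.
final class DeepCopySketch {

  static Object copy(Object value) {
    if (value instanceof List) {
      // LIST case: copy every element in order.
      List<Object> out = new ArrayList<>();
      for (Object element : (List<?>) value) {
        out.add(copy(element));
      }
      return out;
    }
    if (value instanceof Map) {
      // MAP case: copy only children that are set (non-null here),
      // mirroring the childReader.isSet() check in the template.
      Map<String, Object> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
        if (entry.getValue() != null) {
          out.put((String) entry.getKey(), copy(entry.getValue()));
        }
      }
      return out;
    }
    // Scalar case: boxed primitives and strings are immutable, so the
    // reference can be reused as the copied value.
    return value;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("id", 1);
    row.put("tags", new ArrayList<Object>(List.of("a", "b")));
    row.put("unset", null);
    System.out.println(copy(row));
  }
}
```

As in the template, the writer side is chosen per child, so nested lists of maps and maps of lists fall out of the same two recursive cases without special handling.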

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/ComplexReaders.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/ComplexReaders.java b/java/vector/src/main/codegen/templates/ComplexReaders.java
new file mode 100644
index 0000000..34c6571
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/ComplexReaders.java
@@ -0,0 +1,183 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.lang.Override;
+import java.util.List;
+
+import org.apache.arrow.record.TransferPair;
+import org.apache.arrow.vector.complex.IndexHolder;
+import org.apache.arrow.vector.complex.writer.IntervalWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+<#list ["", "Repeated"] as mode>
+<#assign lowerName = minor.class?uncap_first />
+<#if lowerName == "int" ><#assign lowerName = "integer" /></#if>
+<#assign name = mode + minor.class?cap_first />
+<#assign javaType = (minor.javaType!type.javaType) />
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+<#assign safeType=friendlyType />
+<#if safeType=="byte[]"><#assign safeType="ByteArray" /></#if>
+
+<#assign hasFriendly = minor.friendlyType!"no" == "no" />
+
+<#list ["", "Nullable"] as nullMode>
+<#if (mode == "Repeated" && nullMode  == "") || mode == "" >
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/${nullMode}${name}ReaderImpl.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+@SuppressWarnings("unused")
+public class ${nullMode}${name}ReaderImpl extends AbstractFieldReader {
+  
+  private final ${nullMode}${name}Vector vector;
+  
+  public ${nullMode}${name}ReaderImpl(${nullMode}${name}Vector vector){
+    super();
+    this.vector = vector;
+  }
+
+  public MajorType getType(){
+    return vector.getField().getType();
+  }
+
+  public MaterializedField getField(){
+    return vector.getField();
+  }
+  
+  public boolean isSet(){
+    <#if nullMode == "Nullable">
+    return !vector.getAccessor().isNull(idx());
+    <#else>
+    return true;
+    </#if>
+  }
+
+
+  
+  
+  <#if mode == "Repeated">
+
+  public void copyAsValue(${minor.class?cap_first}Writer writer){
+    Repeated${minor.class?cap_first}WriterImpl impl = (Repeated${minor.class?cap_first}WriterImpl) writer;
+    impl.vector.copyFromSafe(idx(), impl.idx(), vector);
+  }
+  
+  public void copyAsField(String name, MapWriter writer){
+    Repeated${minor.class?cap_first}WriterImpl impl = (Repeated${minor.class?cap_first}WriterImpl)  writer.list(name).${lowerName}();
+    impl.vector.copyFromSafe(idx(), impl.idx(), vector);
+  }
+  
+  public int size(){
+    return vector.getAccessor().getInnerValueCountAt(idx());
+  }
+  
+  public void read(int arrayIndex, ${minor.class?cap_first}Holder h){
+    vector.getAccessor().get(idx(), arrayIndex, h);
+  }
+  public void read(int arrayIndex, Nullable${minor.class?cap_first}Holder h){
+    vector.getAccessor().get(idx(), arrayIndex, h);
+  }
+  
+  public ${friendlyType} read${safeType}(int arrayIndex){
+    return vector.getAccessor().getSingleObject(idx(), arrayIndex);
+  }
+
+  
+  public List<Object> readObject(){
+    return (List<Object>) (Object) vector.getAccessor().getObject(idx());
+  }
+  
+  <#else>
+  
+  public void copyAsValue(${minor.class?cap_first}Writer writer){
+    ${nullMode}${minor.class?cap_first}WriterImpl impl = (${nullMode}${minor.class?cap_first}WriterImpl) writer;
+    impl.vector.copyFromSafe(idx(), impl.idx(), vector);
+  }
+  
+  public void copyAsField(String name, MapWriter writer){
+    ${nullMode}${minor.class?cap_first}WriterImpl impl = (${nullMode}${minor.class?cap_first}WriterImpl) writer.${lowerName}(name);
+    impl.vector.copyFromSafe(idx(), impl.idx(), vector);
+  }
+
+  <#if nullMode != "Nullable">
+  public void read(${minor.class?cap_first}Holder h){
+    vector.getAccessor().get(idx(), h);
+  }
+  </#if>
+
+  public void read(Nullable${minor.class?cap_first}Holder h){
+    vector.getAccessor().get(idx(), h);
+  }
+  
+  public ${friendlyType} read${safeType}(){
+    return vector.getAccessor().getObject(idx());
+  }
+  
+  public void copyValue(FieldWriter w){
+    
+  }
+  
+  public Object readObject(){
+    return vector.getAccessor().getObject(idx());
+  }
+
+  
+  </#if>
+}
+</#if>
+</#list>
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/reader/${name}Reader.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.reader;
+
+<#include "/@includes/vv_imports.ftl" />
+@SuppressWarnings("unused")
+public interface ${name}Reader extends BaseReader{
+  
+  <#if mode == "Repeated">
+  public int size();
+  public void read(int arrayIndex, ${minor.class?cap_first}Holder h);
+  public void read(int arrayIndex, Nullable${minor.class?cap_first}Holder h);
+  public Object readObject(int arrayIndex);
+  public ${friendlyType} read${safeType}(int arrayIndex);
+  <#else>
+  public void read(${minor.class?cap_first}Holder h);
+  public void read(Nullable${minor.class?cap_first}Holder h);
+  public Object readObject();
+  public ${friendlyType} read${safeType}();
+  </#if>  
+  public boolean isSet();
+  public void copyAsValue(${minor.class}Writer writer);
+  public void copyAsField(String name, ${minor.class}Writer writer);
+  
+}
+
+
+
+</#list>
+</#list>
+</#list>
+
+
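
The ComplexWriters template in the next file section generates writers whose `write` methods store a value at the current position and then set the value count to `idx()+1`; the Nullable variants also get `writeNull()`. The following is a minimal sketch of that pattern only. The class name `NullableIntWriterSketch` is hypothetical, and a plain `int[]` plus a `BitSet` stand in for the Arrow vector and its mutator (an assumption made to keep the example self-contained).

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical illustration of a positional nullable writer; not an Arrow class. */
final class NullableIntWriterSketch {
  private int[] values = new int[4];          // stand-in for the vector's data buffer
  private final BitSet isSet = new BitSet();  // stand-in for the validity bits
  private int position;                       // plays the role of idx()
  private int valueCount;

  void setPosition(int idx) {
    position = idx;
  }

  /** Store a value at the current position and mark it as set. */
  void writeInt(int value) {
    ensureCapacity(position);
    values[position] = value;
    isSet.set(position);
    valueCount = position + 1;  // mirrors setValueCount(idx()+1)
  }

  /** Mark the current position as null; the slot still counts as a value. */
  void writeNull() {
    ensureCapacity(position);
    isSet.clear(position);
    valueCount = position + 1;
  }

  boolean isSet(int index) {
    return isSet.get(index);
  }

  int get(int index) {
    return values[index];
  }

  int getValueCount() {
    return valueCount;
  }

  /** Grow the backing array by doubling until the index fits. */
  private void ensureCapacity(int index) {
    while (index >= values.length) {
      values = Arrays.copyOf(values, values.length * 2);
    }
  }
}
```

A caller would alternate `setPosition(i)` with `writeInt(...)` or `writeNull()`. A null slot advances the value count like any other slot.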

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/ComplexWriters.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/ComplexWriters.java b/java/vector/src/main/codegen/templates/ComplexWriters.java
new file mode 100644
index 0000000..8f9a6e7
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/ComplexWriters.java
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+<#list ["", "Nullable", "Repeated"] as mode>
+<#assign name = mode + minor.class?cap_first />
+<#assign eName = name />
+<#assign javaType = (minor.javaType!type.javaType) />
+<#assign fields = minor.fields!type.fields />
+
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/impl/${eName}WriterImpl.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/*
+ * This class is generated using FreeMarker on the ${.template_name} template.
+ */
+@SuppressWarnings("unused")
+public class ${eName}WriterImpl extends AbstractFieldWriter {
+
+  private final ${name}Vector.Mutator mutator;
+  final ${name}Vector vector;
+
+  public ${eName}WriterImpl(${name}Vector vector, AbstractFieldWriter parent) {
+    super(parent);
+    this.mutator = vector.getMutator();
+    this.vector = vector;
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return vector.getField();
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return vector.getValueCapacity();
+  }
+
+  @Override
+  public void allocate() {
+    vector.allocateNew();
+  }
+
+  @Override
+  public void close() {
+    vector.close();
+  }
+
+  @Override
+  public void clear() {
+    vector.clear();
+  }
+
+  @Override
+  protected int idx() {
+    return super.idx();
+  }
+
+  <#if mode == "Repeated">
+
+  public void write(${minor.class?cap_first}Holder h) {
+    mutator.addSafe(idx(), h);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+
+  public void write(Nullable${minor.class?cap_first}Holder h) {
+    mutator.addSafe(idx(), h);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+
+  <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    mutator.addSafe(idx(), <#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+  </#if>
+
+  public void setPosition(int idx) {
+    super.setPosition(idx);
+    mutator.startNewValue(idx);
+  }
+
+
+  <#else>
+
+  public void write(${minor.class}Holder h) {
+    mutator.setSafe(idx(), h);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+
+  public void write(Nullable${minor.class}Holder h) {
+    mutator.setSafe(idx(), h);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+
+  <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
+    mutator.setSafe(idx(), <#if mode == "Nullable">1, </#if><#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+    vector.getMutator().setValueCount(idx()+1);
+  }
+
+  <#if mode == "Nullable">
+
+  public void writeNull() {
+    mutator.setNull(idx());
+    vector.getMutator().setValueCount(idx()+1);
+  }
+  </#if>
+  </#if>
+  </#if>
+}
+
+<@pp.changeOutputFile name="/org/apache/arrow/vector/complex/writer/${eName}Writer.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector.complex.writer;
+
+<#include "/@includes/vv_imports.ftl" />
+@SuppressWarnings("unused")
+public interface ${eName}Writer extends BaseWriter {
+  public void write(${minor.class}Holder h);
+
+  <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
+  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>);
+  </#if>
+}
+
+</#list>
+</#list>
+</#list>
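
The FixedValueVectors template in the next file section pairs each `setSafe` with a `reAlloc()` that doubles the buffer, copies the existing bytes across, and leaves the new upper half zeroed. The following is a minimal sketch of that growth strategy. It assumes `java.nio.ByteBuffer` as a stand-in for `ArrowBuf`, a fixed width of 4 bytes in place of `${type.width}`, and a hypothetical class name `GrowableIntBuffer`; it is not the generated code.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical stand-in for a generated fixed-width vector; not an Arrow class. */
final class GrowableIntBuffer {
  private static final int WIDTH = 4;  // bytes per value, analogous to ${type.width}
  private ByteBuffer data;

  GrowableIntBuffer(int initialValueCapacity) {
    if (initialValueCapacity < 1) {
      // Doubling a zero-sized buffer would never make progress.
      throw new IllegalArgumentException("initial capacity must be at least 1");
    }
    data = ByteBuffer.allocate(initialValueCapacity * WIDTH);
  }

  int getValueCapacity() {
    return data.capacity() / WIDTH;
  }

  /** Double the buffer; Arrays.copyOf keeps the old bytes and zero-fills the rest. */
  private void reAlloc() {
    final long newSize = data.capacity() * 2L;  // long arithmetic avoids int overflow
    if (newSize > Integer.MAX_VALUE) {
      throw new IllegalStateException("Unable to expand the buffer.");
    }
    data = ByteBuffer.wrap(Arrays.copyOf(data.array(), (int) newSize));
  }

  /** Grow until the index fits, then write at the byte offset index * WIDTH. */
  void setSafe(int index, int value) {
    while (index >= getValueCapacity()) {
      reAlloc();
    }
    data.putInt(index * WIDTH, value);
  }

  int get(int index) {
    return data.getInt(index * WIDTH);
  }
}
```

Starting from a capacity of 2 values, `setSafe(5, 42)` doubles twice and leaves room for 8 values. Values written before the growth remain readable.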

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/codegen/templates/FixedValueVectors.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/codegen/templates/FixedValueVectors.java b/java/vector/src/main/codegen/templates/FixedValueVectors.java
new file mode 100644
index 0000000..18fcac9
--- /dev/null
+++ b/java/vector/src/main/codegen/templates/FixedValueVectors.java
@@ -0,0 +1,813 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import java.lang.Override;
+
+<@pp.dropOutputFile />
+<#list vv.types as type>
+<#list type.minor as minor>
+<#assign friendlyType = (minor.friendlyType!minor.boxedType!type.boxedType) />
+
+<#if type.major == "Fixed">
+<@pp.changeOutputFile name="/org/apache/arrow/vector/${minor.class}Vector.java" />
+<#include "/@includes/license.ftl" />
+
+package org.apache.arrow.vector;
+
+<#include "/@includes/vv_imports.ftl" />
+
+/**
+ * ${minor.class} implements a vector of fixed width values.  Elements in the vector are accessed
+ * by position, starting from the logical start of the vector.  Values should be pushed onto the
+ * vector sequentially, but may be randomly accessed.
+ *   The width of each element is ${type.width} byte(s)
+ *   The equivalent Java primitive is '${minor.javaType!type.javaType}'
+ *
+ * NB: this class is automatically generated from ${.template_name} and ValueVectorTypes.tdd using FreeMarker.
+ */
+public final class ${minor.class}Vector extends BaseDataValueVector implements FixedWidthVector{
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${minor.class}Vector.class);
+
+  private final FieldReader reader = new ${minor.class}ReaderImpl(${minor.class}Vector.this);
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+
+  private int allocationSizeInBytes = INITIAL_VALUE_ALLOCATION * ${type.width};
+  private int allocationMonitor = 0;
+
+  public ${minor.class}Vector(MaterializedField field, BufferAllocator allocator) {
+    super(field, allocator);
+  }
+
+  @Override
+  public FieldReader getReader(){
+    return reader;
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+    return valueCount * ${type.width};
+  }
+
+  @Override
+  public int getValueCapacity(){
+    return data.capacity() / ${type.width};
+  }
+
+  @Override
+  public Accessor getAccessor(){
+    return accessor;
+  }
+
+  @Override
+  public Mutator getMutator(){
+    return mutator;
+  }
+
+  @Override
+  public void setInitialCapacity(final int valueCount) {
+    final long size = 1L * valueCount * ${type.width};
+    if (size > MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size");
+    }
+    allocationSizeInBytes = (int)size;
+  }
+
+  @Override
+  public void allocateNew() {
+    if(!allocateNewSafe()){
+      throw new OutOfMemoryException("Failure while allocating buffer.");
+    }
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    long curAllocationSize = allocationSizeInBytes;
+    if (allocationMonitor > 10) {
+      curAllocationSize = Math.max(8, curAllocationSize / 2);
+      allocationMonitor = 0;
+    } else if (allocationMonitor < -2) {
+      curAllocationSize = allocationSizeInBytes * 2L;
+      allocationMonitor = 0;
+    }
+
+    try{
+      allocateBytes(curAllocationSize);
+    } catch (RuntimeException ex) {
+      return false;
+    }
+    return true;
+  }
+
+  /**
+   * Allocate a new buffer that supports setting at least the provided number of values. May actually be sized bigger
+   * depending on underlying buffer rounding size. Must be called prior to using the ValueVector.
+   *
+   * Note that the maximum number of values a vector can allocate is Integer.MAX_VALUE / value width.
+   *
+   * @param valueCount the minimum number of values the new buffer must be able to hold
+   * @throws org.apache.arrow.memory.OutOfMemoryException if it can't allocate the new buffer
+   */
+  @Override
+  public void allocateNew(final int valueCount) {
+    allocateBytes((long) valueCount * ${type.width});
+  }
+
+  @Override
+  public void reset() {
+    allocationSizeInBytes = INITIAL_VALUE_ALLOCATION * ${type.width};
+    allocationMonitor = 0;
+    zeroVector();
+    super.reset();
+  }
+
+  private void allocateBytes(final long size) {
+    if (size > MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size");
+    }
+
+    final int curSize = (int)size;
+    clear();
+    data = allocator.buffer(curSize);
+    data.readerIndex(0);
+    allocationSizeInBytes = curSize;
+  }
+
+  /**
+   * Allocate a new buffer with double the capacity, copy the data into it, replace the vector's buffer with it, and release the old one.
+   *
+   * @throws org.apache.arrow.memory.OutOfMemoryException if it can't allocate the new buffer
+   */
+  public void reAlloc() {
+    final long newAllocationSize = allocationSizeInBytes * 2L;
+    if (newAllocationSize > MAX_ALLOCATION_SIZE)  {
+      throw new OversizedAllocationException("Unable to expand the buffer. Max allowed buffer size is reached.");
+    }
+
+    logger.debug("Reallocating vector [{}]. # of bytes: [{}] -> [{}]", field, allocationSizeInBytes, newAllocationSize);
+    final ArrowBuf newBuf = allocator.buffer((int)newAllocationSize);
+    newBuf.setBytes(0, data, 0, data.capacity());
+    final int halfNewCapacity = newBuf.capacity() / 2;
+    newBuf.setZero(halfNewCapacity, halfNewCapacity);
+    newBuf.writerIndex(data.writerIndex());
+    data.release(1);
+    data = newBuf;
+    allocationSizeInBytes = (int)newAllocationSize;
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void zeroVector() {
+    data.setZero(0, data.capacity());
+  }
+
+//  @Override
+//  public void load(SerializedField metadata, ArrowBuf buffer) {
+//    Preconditions.checkArgument(this.field.getPath().equals(metadata.getNamePart().getName()), "The field %s doesn't match the provided metadata %s.", this.field, metadata);
+//    final int actualLength = metadata.getBufferLength();
+//    final int valueCount = metadata.getValueCount();
+//    final int expectedLength = valueCount * ${type.width};
+//    assert actualLength == expectedLength : String.format("Expected to load %d bytes but actually loaded %d bytes", expectedLength, actualLength);
+//
+//    clear();
+//    if (data != null) {
+//      data.release(1);
+//    }
+//    data = buffer.slice(0, actualLength);
+//    data.retain(1);
+//    data.writerIndex(actualLength);
+//    }
+
+  public TransferPair getTransferPair(BufferAllocator allocator){
+    return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator){
+    return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new TransferImpl((${minor.class}Vector) to);
+  }
+
+  public void transferTo(${minor.class}Vector target){
+    target.clear();
+    target.data = data.transferOwnership(target.allocator).buffer;
+    target.data.writerIndex(data.writerIndex());
+    clear();
+  }
+
+  public void splitAndTransferTo(int startIndex, int length, ${minor.class}Vector target) {
+    final int startPoint = startIndex * ${type.width};
+    final int sliceLength = length * ${type.width};
+    target.clear();
+    target.data = data.slice(startPoint, sliceLength).transferOwnership(target.allocator).buffer;
+    target.data.writerIndex(sliceLength);
+  }
+
+  private class TransferImpl implements TransferPair{
+    private ${minor.class}Vector to;
+
+    public TransferImpl(MaterializedField field, BufferAllocator allocator){
+      to = new ${minor.class}Vector(field, allocator);
+    }
+
+    public TransferImpl(${minor.class}Vector to) {
+      this.to = to;
+    }
+
+    @Override
+    public ${minor.class}Vector getTo(){
+      return to;
+    }
+
+    @Override
+    public void transfer(){
+      transferTo(to);
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      splitAndTransferTo(startIndex, length, to);
+    }
+
+    @Override
+    public void copyValueSafe(int fromIndex, int toIndex) {
+      to.copyFromSafe(fromIndex, toIndex, ${minor.class}Vector.this);
+    }
+  }
+
+  public void copyFrom(int fromIndex, int thisIndex, ${minor.class}Vector from){
+    <#if (type.width > 8)>
+    from.data.getBytes(fromIndex * ${type.width}, data, thisIndex * ${type.width}, ${type.width});
+    <#else> <#-- type.width <= 8 -->
+    data.set${(minor.javaType!type.javaType)?cap_first}(thisIndex * ${type.width},
+        from.data.get${(minor.javaType!type.javaType)?cap_first}(fromIndex * ${type.width})
+    );
+    </#if> <#-- type.width -->
+  }
+
+  public void copyFromSafe(int fromIndex, int thisIndex, ${minor.class}Vector from){
+    while(thisIndex >= getValueCapacity()) {
+        reAlloc();
+    }
+    copyFrom(fromIndex, thisIndex, from);
+  }
+
+  public void decrementAllocationMonitor() {
+    if (allocationMonitor > 0) {
+      allocationMonitor = 0;
+    }
+    --allocationMonitor;
+  }
+
+  private void incrementAllocationMonitor() {
+    ++allocationMonitor;
+  }
+
+  public final class Accessor extends BaseDataValueVector.BaseAccessor {
+    @Override
+    public int getValueCount() {
+      return data.writerIndex() / ${type.width};
+    }
+
+    @Override
+    public boolean isNull(int index){
+      return false;
+    }
+
+    <#if (type.width > 8)>
+
+    public ${minor.javaType!type.javaType} get(int index) {
+      return data.slice(index * ${type.width}, ${type.width});
+    }
+
+    <#if (minor.class == "Interval")>
+    public void get(int index, ${minor.class}Holder holder){
+
+      final int offsetIndex = index * ${type.width};
+      holder.months = data.getInt(offsetIndex);
+      holder.days = data.getInt(offsetIndex + ${minor.daysOffset});
+      holder.milliseconds = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      final int offsetIndex = index * ${type.width};
+      holder.isSet = 1;
+      holder.months = data.getInt(offsetIndex);
+      holder.days = data.getInt(offsetIndex + ${minor.daysOffset});
+      holder.milliseconds = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+    }
+
+    @Override
+    public ${friendlyType} getObject(int index) {
+      final int offsetIndex = index * ${type.width};
+      final int months  = data.getInt(offsetIndex);
+      final int days    = data.getInt(offsetIndex + ${minor.daysOffset});
+      final int millis = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+      final Period p = new Period();
+      return p.plusMonths(months).plusDays(days).plusMillis(millis);
+    }
+
+    public StringBuilder getAsStringBuilder(int index) {
+
+      final int offsetIndex = index * ${type.width};
+
+      int months  = data.getInt(offsetIndex);
+      final int days    = data.getInt(offsetIndex + ${minor.daysOffset});
+      int millis = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+
+      final int years  = (months / org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+      months = (months % org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+
+      final int hours  = millis / (org.apache.arrow.vector.util.DateUtility.hoursToMillis);
+      millis     = millis % (org.apache.arrow.vector.util.DateUtility.hoursToMillis);
+
+      final int minutes = millis / (org.apache.arrow.vector.util.DateUtility.minutesToMillis);
+      millis      = millis % (org.apache.arrow.vector.util.DateUtility.minutesToMillis);
+
+      final long seconds = millis / (org.apache.arrow.vector.util.DateUtility.secondsToMillis);
+      millis      = millis % (org.apache.arrow.vector.util.DateUtility.secondsToMillis);
+
+      final String yearString = (Math.abs(years) == 1) ? " year " : " years ";
+      final String monthString = (Math.abs(months) == 1) ? " month " : " months ";
+      final String dayString = (Math.abs(days) == 1) ? " day " : " days ";
+
+
+      return(new StringBuilder().
+             append(years).append(yearString).
+             append(months).append(monthString).
+             append(days).append(dayString).
+             append(hours).append(":").
+             append(minutes).append(":").
+             append(seconds).append(".").
+             append(millis));
+    }
+
+    <#elseif (minor.class == "IntervalDay")>
+    public void get(int index, ${minor.class}Holder holder){
+
+      final int offsetIndex = index * ${type.width};
+      holder.days = data.getInt(offsetIndex);
+      holder.milliseconds = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      final int offsetIndex = index * ${type.width};
+      holder.isSet = 1;
+      holder.days = data.getInt(offsetIndex);
+      holder.milliseconds = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+    }
+
+    @Override
+    public ${friendlyType} getObject(int index) {
+      final int offsetIndex = index * ${type.width};
+      final int millis = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+      final int  days   = data.getInt(offsetIndex);
+      final Period p = new Period();
+      return p.plusDays(days).plusMillis(millis);
+    }
+
+
+    public StringBuilder getAsStringBuilder(int index) {
+      final int offsetIndex = index * ${type.width};
+
+      int millis = data.getInt(offsetIndex + ${minor.millisecondsOffset});
+      final int  days   = data.getInt(offsetIndex);
+
+      final int hours  = millis / (org.apache.arrow.vector.util.DateUtility.hoursToMillis);
+      millis     = millis % (org.apache.arrow.vector.util.DateUtility.hoursToMillis);
+
+      final int minutes = millis / (org.apache.arrow.vector.util.DateUtility.minutesToMillis);
+      millis      = millis % (org.apache.arrow.vector.util.DateUtility.minutesToMillis);
+
+      final int seconds = millis / (org.apache.arrow.vector.util.DateUtility.secondsToMillis);
+      millis      = millis % (org.apache.arrow.vector.util.DateUtility.secondsToMillis);
+
+      final String dayString = (Math.abs(days) == 1) ? " day " : " days ";
+
+      return(new StringBuilder().
+              append(days).append(dayString).
+              append(hours).append(":").
+              append(minutes).append(":").
+              append(seconds).append(".").
+              append(millis));
+    }
+
+    <#elseif (minor.class == "Decimal28Sparse") || (minor.class == "Decimal38Sparse") || (minor.class == "Decimal28Dense") || (minor.class == "Decimal38Dense")>
+
+    public void get(int index, ${minor.class}Holder holder) {
+        holder.start = index * ${type.width};
+        holder.buffer = data;
+        holder.scale = getField().getScale();
+        holder.precision = getField().getPrecision();
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder) {
+        holder.isSet = 1;
+        holder.start = index * ${type.width};
+        holder.buffer = data;
+        holder.scale = getField().getScale();
+        holder.precision = getField().getPrecision();
+    }
+
+    @Override
+    public ${friendlyType} getObject(int index) {
+      <#if (minor.class == "Decimal28Sparse") || (minor.class == "Decimal38Sparse")>
+      // Get the BigDecimal object
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromSparse(data, index * ${type.width}, ${minor.nDecimalDigits}, getField().getScale());
+      <#else>
+      return org.apache.arrow.vector.util.DecimalUtility.getBigDecimalFromDense(data, index * ${type.width}, ${minor.nDecimalDigits}, getField().getScale(), ${minor.maxPrecisionDigits}, ${type.width});
+      </#if>
+    }
+
+    <#else>
+    public void get(int index, ${minor.class}Holder holder){
+      holder.buffer = data;
+      holder.start = index * ${type.width};
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      holder.isSet = 1;
+      holder.buffer = data;
+      holder.start = index * ${type.width};
+    }
+
+    @Override
+    public ${friendlyType} getObject(int index) {
+      return data.slice(index * ${type.width}, ${type.width});
+    }
+
+    </#if>
+    <#else> <#-- type.width <= 8 -->
+
+    public ${minor.javaType!type.javaType} get(int index) {
+      return data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+    }
+
+    <#if type.width == 4>
+    public long getTwoAsLong(int index) {
+      return data.getLong(index * ${type.width});
+    }
+
+    </#if>
+
+    <#if minor.class == "Date">
+    @Override
+    public ${friendlyType} getObject(int index) {
+        org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
+        date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
+        return date;
+    }
+
+    <#elseif minor.class == "TimeStamp">
+    @Override
+    public ${friendlyType} getObject(int index) {
+        org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
+        date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
+        return date;
+    }
+
+    <#elseif minor.class == "IntervalYear">
+    @Override
+    public ${friendlyType} getObject(int index) {
+
+      final int value = get(index);
+
+      final int years  = (value / org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+      final int months = (value % org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+      final Period p = new Period();
+      return p.plusYears(years).plusMonths(months);
+    }
+
+    public StringBuilder getAsStringBuilder(int index) {
+
+      int months  = data.getInt(index * ${type.width});
+
+      final int years  = (months / org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+      months = (months % org.apache.arrow.vector.util.DateUtility.yearsToMonths);
+
+      final String yearString = (Math.abs(years) == 1) ? " year " : " years ";
+      final String monthString = (Math.abs(months) == 1) ? " month " : " months ";
+
+      return(new StringBuilder().
+             append(years).append(yearString).
+             append(months).append(monthString));
+    }
+
+    <#elseif minor.class == "Time">
+    @Override
+    public DateTime getObject(int index) {
+
+        org.joda.time.DateTime time = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
+        time = time.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
+        return time;
+    }
+
+    <#elseif minor.class == "Decimal9" || minor.class == "Decimal18">
+    @Override
+    public ${friendlyType} getObject(int index) {
+
+        final BigInteger value = BigInteger.valueOf(((${type.boxedType})get(index)).${type.javaType}Value());
+        return new BigDecimal(value, getField().getScale());
+    }
+
+    <#else>
+    @Override
+    public ${friendlyType} getObject(int index) {
+      return get(index);
+    }
+    public ${minor.javaType!type.javaType} getPrimitiveObject(int index) {
+      return get(index);
+    }
+    </#if>
+
+    public void get(int index, ${minor.class}Holder holder){
+      <#if minor.class.startsWith("Decimal")>
+      holder.scale = getField().getScale();
+      holder.precision = getField().getPrecision();
+      </#if>
+
+      holder.value = data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+    }
+
+    public void get(int index, Nullable${minor.class}Holder holder){
+      holder.isSet = 1;
+      holder.value = data.get${(minor.javaType!type.javaType)?cap_first}(index * ${type.width});
+    }
+
+
+   </#if> <#-- type.width -->
+ }
+
+ /**
+  * ${minor.class}.Mutator implements a mutable vector of fixed width values.  Elements in the
+  * vector are accessed by position from the logical start of the vector.  Values should be pushed
+  * onto the vector sequentially, but may be randomly accessed.
+  *   The width of each element is ${type.width} byte(s)
+  *   The equivalent Java primitive is '${minor.javaType!type.javaType}'
+  *
+  * NB: this class is automatically generated from ValueVectorTypes.tdd using FreeMarker.
+  */
+  public final class Mutator extends BaseDataValueVector.BaseMutator {
+
+    private Mutator(){}
+   /**
+    * Set the element at the given index to the given value.  Note that widths smaller than
+    * 32 bits are handled by the ArrowBuf interface.
+    *
+    * @param index   position of the value to set
+    * @param value   value to set
+    */
+  <#if (type.width > 8)>
+   public void set(int index, <#if (type.width > 4)>${minor.javaType!type.javaType}<#else>int</#if> value) {
+     data.setBytes(index * ${type.width}, value, 0, ${type.width});
+   }
+
+   public void setSafe(int index, <#if (type.width > 4)>${minor.javaType!type.javaType}<#else>int</#if> value) {
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     data.setBytes(index * ${type.width}, value, 0, ${type.width});
+   }
+
+  <#if (minor.class == "Interval")>
+   public void set(int index, int months, int days, int milliseconds){
+     final int offsetIndex = index * ${type.width};
+     data.setInt(offsetIndex, months);
+     data.setInt((offsetIndex + ${minor.daysOffset}), days);
+     data.setInt((offsetIndex + ${minor.millisecondsOffset}), milliseconds);
+   }
+
+   protected void set(int index, ${minor.class}Holder holder){
+     set(index, holder.months, holder.days, holder.milliseconds);
+   }
+
+   protected void set(int index, Nullable${minor.class}Holder holder){
+     set(index, holder.months, holder.days, holder.milliseconds);
+   }
+
+   public void setSafe(int index, int months, int days, int milliseconds){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, months, days, milliseconds);
+   }
+
+   public void setSafe(int index, Nullable${minor.class}Holder holder){
+     setSafe(index, holder.months, holder.days, holder.milliseconds);
+   }
+
+   public void setSafe(int index, ${minor.class}Holder holder){
+     setSafe(index, holder.months, holder.days, holder.milliseconds);
+   }
+
+   <#elseif (minor.class == "IntervalDay")>
+   public void set(int index, int days, int milliseconds){
+     final int offsetIndex = index * ${type.width};
+     data.setInt(offsetIndex, days);
+     data.setInt((offsetIndex + ${minor.millisecondsOffset}), milliseconds);
+   }
+
+   protected void set(int index, ${minor.class}Holder holder){
+     set(index, holder.days, holder.milliseconds);
+   }
+   protected void set(int index, Nullable${minor.class}Holder holder){
+     set(index, holder.days, holder.milliseconds);
+   }
+
+   public void setSafe(int index, int days, int milliseconds){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, days, milliseconds);
+   }
+
+   public void setSafe(int index, ${minor.class}Holder holder){
+     setSafe(index, holder.days, holder.milliseconds);
+   }
+
+   public void setSafe(int index, Nullable${minor.class}Holder holder){
+     setSafe(index, holder.days, holder.milliseconds);
+   }
+
+   <#elseif (minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse") || (minor.class == "Decimal28Dense") || (minor.class == "Decimal38Dense")>
+
+   public void set(int index, ${minor.class}Holder holder){
+     set(index, holder.start, holder.buffer);
+   }
+
+   void set(int index, Nullable${minor.class}Holder holder){
+     set(index, holder.start, holder.buffer);
+   }
+
+   public void setSafe(int index,  Nullable${minor.class}Holder holder){
+     setSafe(index, holder.start, holder.buffer);
+   }
+   public void setSafe(int index,  ${minor.class}Holder holder){
+     setSafe(index, holder.start, holder.buffer);
+   }
+
+   public void setSafe(int index, int start, ArrowBuf buffer){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, start, buffer);
+   }
+
+   public void set(int index, int start, ArrowBuf buffer){
+     data.setBytes(index * ${type.width}, buffer, start, ${type.width});
+   }
+
+   <#else>
+
+   protected void set(int index, ${minor.class}Holder holder){
+     set(index, holder.start, holder.buffer);
+   }
+
+   public void set(int index, Nullable${minor.class}Holder holder){
+     set(index, holder.start, holder.buffer);
+   }
+
+   public void set(int index, int start, ArrowBuf buffer){
+     data.setBytes(index * ${type.width}, buffer, start, ${type.width});
+   }
+
+   public void setSafe(int index, ${minor.class}Holder holder){
+     setSafe(index, holder.start, holder.buffer);
+   }
+   public void setSafe(int index, Nullable${minor.class}Holder holder){
+     setSafe(index, holder.start, holder.buffer);
+   }
+
+   public void setSafe(int index, int start, ArrowBuf buffer){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+       set(index, start, buffer);
+   }
+
+   public void set(int index, Nullable${minor.class}Holder holder){
+     data.setBytes(index * ${type.width}, holder.buffer, holder.start, ${type.width});
+   }
+   </#if>
+
+   @Override
+   public void generateTestData(int count) {
+     setValueCount(count);
+     boolean even = true;
+     final int valueCount = getAccessor().getValueCount();
+     for(int i = 0; i < valueCount; i++, even = !even) {
+       final byte b = even ? Byte.MIN_VALUE : Byte.MAX_VALUE;
+       for(int w = 0; w < ${type.width}; w++){
+         data.setByte(i * ${type.width} + w, b);
+       }
+     }
+   }
+
+   <#else> <#-- type.width <= 8 -->
+   public void set(int index, <#if (type.width >= 4)>${minor.javaType!type.javaType}<#else>int</#if> value) {
+     data.set${(minor.javaType!type.javaType)?cap_first}(index * ${type.width}, value);
+   }
+
+   public void setSafe(int index, <#if (type.width >= 4)>${minor.javaType!type.javaType}<#else>int</#if> value) {
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, value);
+   }
+
+   protected void set(int index, ${minor.class}Holder holder){
+     data.set${(minor.javaType!type.javaType)?cap_first}(index * ${type.width}, holder.value);
+   }
+
+   public void setSafe(int index, ${minor.class}Holder holder){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, holder);
+   }
+
+   protected void set(int index, Nullable${minor.class}Holder holder){
+     data.set${(minor.javaType!type.javaType)?cap_first}(index * ${type.width}, holder.value);
+   }
+
+   public void setSafe(int index, Nullable${minor.class}Holder holder){
+     while(index >= getValueCapacity()) {
+       reAlloc();
+     }
+     set(index, holder);
+   }
+
+   @Override
+   public void generateTestData(int size) {
+     setValueCount(size);
+     boolean even = true;
+     final int valueCount = getAccessor().getValueCount();
+     for(int i = 0; i < valueCount; i++, even = !even) {
+       if(even){
+         set(i, ${minor.boxedType!type.boxedType}.MIN_VALUE);
+       }else{
+         set(i, ${minor.boxedType!type.boxedType}.MAX_VALUE);
+       }
+     }
+   }
+
+   public void generateTestDataAlt(int size) {
+     setValueCount(size);
+     boolean even = true;
+     final int valueCount = getAccessor().getValueCount();
+     for(int i = 0; i < valueCount; i++, even = !even) {
+       if(even){
+         set(i, (${(minor.javaType!type.javaType)}) 1);
+       }else{
+         set(i, (${(minor.javaType!type.javaType)}) 0);
+       }
+     }
+   }
+
+  </#if> <#-- type.width -->
+
+   @Override
+   public void setValueCount(int valueCount) {
+     final int currentValueCapacity = getValueCapacity();
+     final int idx = (${type.width} * valueCount);
+     while(valueCount > getValueCapacity()) {
+       reAlloc();
+     }
+     if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
+       incrementAllocationMonitor();
+     } else if (allocationMonitor > 0) {
+       allocationMonitor = 0;
+     }
+     VectorTrimmer.trim(data, idx);
+     data.writerIndex(valueCount * ${type.width});
+   }
+ }
+}
+
+</#if> <#-- type.major -->
+</#list>
+</#list>
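
Note on the setSafe methods in the template above: each one grows the vector until the target index fits and then delegates to the unchecked set. The following is a minimal, self-contained sketch of that pattern only. The class name GrowableIntVector, the byte[] backing store, the little-endian byte order, and the doubling policy in reAlloc are assumptions made for illustration; the generated vectors write through ArrowBuf and use their own reallocation logic.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a fixed-width vector; not the generated Arrow class. */
public class GrowableIntVector {
  // Bytes per value; plays the role of ${type.width} in the template.
  private static final int WIDTH = 4;
  private byte[] data;

  public GrowableIntVector(int initialValueCapacity) {
    data = new byte[Math.max(1, initialValueCapacity) * WIDTH];
  }

  public int getValueCapacity() {
    return data.length / WIDTH;
  }

  // Assumed policy: double the backing storage, preserving existing bytes.
  private void reAlloc() {
    data = Arrays.copyOf(data, data.length * 2);
  }

  // Unchecked write: the caller must guarantee index < getValueCapacity().
  public void set(int index, int value) {
    final int offset = index * WIDTH;
    data[offset] = (byte) value;
    data[offset + 1] = (byte) (value >>> 8);
    data[offset + 2] = (byte) (value >>> 16);
    data[offset + 3] = (byte) (value >>> 24);
  }

  // Checked write: grow until the index fits, then delegate to set().
  public void setSafe(int index, int value) {
    while (index >= getValueCapacity()) {
      reAlloc();
    }
    set(index, value);
  }

  public int get(int index) {
    final int offset = index * WIDTH;
    return (data[offset] & 0xFF)
        | ((data[offset + 1] & 0xFF) << 8)
        | ((data[offset + 2] & 0xFF) << 16)
        | ((data[offset + 3] & 0xFF) << 24);
  }
}
```

Keeping set unchecked and putting the capacity loop only in setSafe lets callers that have already sized the vector skip the capacity check on every write.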


[16/17] arrow git commit: ARROW-4: This provides a partial C++11 implementation of the Apache Arrow data structures along with a cmake-based build system. The codebase generally follows Google C++ style guide, but more cleaning to be more conforming is

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/23c4b08d/cpp/build-support/cpplint.py
----------------------------------------------------------------------
diff --git a/cpp/build-support/cpplint.py b/cpp/build-support/cpplint.py
new file mode 100755
index 0000000..ccc25d4
--- /dev/null
+++ b/cpp/build-support/cpplint.py
@@ -0,0 +1,6323 @@
+#!/usr/bin/env python
+#
+# Copyright (c) 2009 Google Inc. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+#    * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#    * Redistributions in binary form must reproduce the above
+# copyright notice, this list of conditions and the following disclaimer
+# in the documentation and/or other materials provided with the
+# distribution.
+#    * Neither the name of Google Inc. nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+"""Does google-lint on c++ files.
+
+The goal of this script is to identify places in the code that *may*
+be in non-compliance with google style.  It does not attempt to fix
+up these problems -- the point is to educate.  It also does not
+attempt to find all problems, or to ensure that everything it does
+find is legitimately a problem.
+
+In particular, we can get very confused by /* and // inside strings!
+We do a small hack, which is to ignore //'s with "'s after them on the
+same line, but it is far from perfect (in either direction).
+"""
+
+import codecs
+import copy
+import getopt
+import math  # for log
+import os
+import re
+import sre_compile
+import string
+import sys
+import unicodedata
+
+
+_USAGE = """
+Syntax: cpplint.py [--verbose=#] [--output=vs7] [--filter=-x,+y,...]
+                   [--counting=total|toplevel|detailed] [--root=subdir]
+                   [--linelength=digits]
+        <file> [file] ...
+
+  The style guidelines this tries to follow are those in
+    http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
+
+  Every problem is given a confidence score from 1-5, with 5 meaning we are
+  certain of the problem, and 1 meaning it could be a legitimate construct.
+  This will miss some errors, and is not a substitute for a code review.
+
+  To suppress false-positive errors of a certain category, add a
+  'NOLINT(category)' comment to the line.  NOLINT or NOLINT(*)
+  suppresses errors of all categories on that line.
+
+  The files passed in will be linted; at least one file must be provided.
+  Default linted extensions are .cc, .cpp, .cu, .cuh and .h.  Change the
+  extensions with the --extensions flag.
+
+  Flags:
+
+    output=vs7
+      By default, the output is formatted to ease emacs parsing.  Visual Studio
+      compatible output (vs7) may also be used.  Other formats are unsupported.
+
+    verbose=#
+      Specify a number 0-5 to restrict errors to certain verbosity levels.
+
+    filter=-x,+y,...
+      Specify a comma-separated list of category-filters to apply: only
+      error messages whose category names pass the filters will be printed.
+      (Category names are printed with the message and look like
+      "[whitespace/indent]".)  Filters are evaluated left to right.
+      "-FOO" and "FOO" means "do not print categories that start with FOO".
+      "+FOO" means "do print categories that start with FOO".
+
+      Examples: --filter=-whitespace,+whitespace/braces
+                --filter=whitespace,runtime/printf,+runtime/printf_format
+                --filter=-,+build/include_what_you_use
+
+      To see a list of all the categories used in cpplint, pass no arg:
+         --filter=
+
+    counting=total|toplevel|detailed
+      The total number of errors found is always printed. If
+      'toplevel' is provided, then the count of errors in each of
+      the top-level categories like 'build' and 'whitespace' will
+      also be printed. If 'detailed' is provided, then a count
+      is provided for each category like 'build/class'.
+
+    root=subdir
+      The root directory used for deriving header guard CPP variable.
+      By default, the header guard CPP variable is calculated as the relative
+      path to the directory that contains .git, .hg, or .svn.  When this flag
+      is specified, the relative path is calculated from the specified
+      directory. If the specified directory does not exist, this flag is
+      ignored.
+
+      Examples:
+        Assuming that src/.git exists, the header guard CPP variables for
+        src/chrome/browser/ui/browser.h are:
+
+        No flag => CHROME_BROWSER_UI_BROWSER_H_
+        --root=chrome => BROWSER_UI_BROWSER_H_
+        --root=chrome/browser => UI_BROWSER_H_
+
+    linelength=digits
+      This is the allowed line length for the project. The default value is
+      80 characters.
+
+      Examples:
+        --linelength=120
+
+    extensions=extension,extension,...
+      The allowed file extensions that cpplint will check
+
+      Examples:
+        --extensions=hpp,cpp
+
+    cpplint.py supports per-directory configurations specified in CPPLINT.cfg
+    files. CPPLINT.cfg file can contain a number of key=value pairs.
+    Currently the following options are supported:
+
+      set noparent
+      filter=+filter1,-filter2,...
+      exclude_files=regex
+      linelength=80
+
+    "set noparent" option prevents cpplint from traversing directory tree
+    upwards looking for more .cfg files in parent directories. This option
+    is usually placed in the top-level project directory.
+
+    The "filter" option is similar in function to --filter flag. It specifies
+    message filters in addition to the |_DEFAULT_FILTERS| and those specified
+    through --filter command-line flag.
+
+    "exclude_files" specifies a regular expression to be matched against
+    a file name. If the expression matches, the file is skipped and not run
+    through the linter.
+
+    "linelength" specifies the allowed line length for the project.
+
+    CPPLINT.cfg has an effect on files in the same directory and all
+    sub-directories, unless overridden by a nested configuration file.
+
+      Example file:
+        filter=-build/include_order,+build/include_alpha
+        exclude_files=.*\.cc
+
+    The above example disables the build/include_order warning, enables
+    build/include_alpha, and excludes all .cc files from being
+    processed by the linter in the current directory (where the .cfg
+    file is located) and all sub-directories.
+"""
+
+# We categorize each error message we print.  Here are the categories.
+# We want an explicit list so we can list them all in cpplint --filter=.
+# If you add a new error message with a new category, add it to the list
+# here!  cpplint_unittest.py should tell you if you forget to do this.
+_ERROR_CATEGORIES = [
+    'build/class',
+    'build/c++11',
+    'build/deprecated',
+    'build/endif_comment',
+    'build/explicit_make_pair',
+    'build/forward_decl',
+    'build/header_guard',
+    'build/include',
+    'build/include_alpha',
+    'build/include_order',
+    'build/include_what_you_use',
+    'build/namespaces',
+    'build/printf_format',
+    'build/storage_class',
+    'legal/copyright',
+    'readability/alt_tokens',
+    'readability/braces',
+    'readability/casting',
+    'readability/check',
+    'readability/constructors',
+    'readability/fn_size',
+    'readability/function',
+    'readability/inheritance',
+    'readability/multiline_comment',
+    'readability/multiline_string',
+    'readability/namespace',
+    'readability/nolint',
+    'readability/nul',
+    'readability/strings',
+    'readability/todo',
+    'readability/utf8',
+    'runtime/arrays',
+    'runtime/casting',
+    'runtime/explicit',
+    'runtime/int',
+    'runtime/init',
+    'runtime/invalid_increment',
+    'runtime/member_string_references',
+    'runtime/memset',
+    'runtime/indentation_namespace',
+    'runtime/operator',
+    'runtime/printf',
+    'runtime/printf_format',
+    'runtime/references',
+    'runtime/string',
+    'runtime/threadsafe_fn',
+    'runtime/vlog',
+    'whitespace/blank_line',
+    'whitespace/braces',
+    'whitespace/comma',
+    'whitespace/comments',
+    'whitespace/empty_conditional_body',
+    'whitespace/empty_loop_body',
+    'whitespace/end_of_line',
+    'whitespace/ending_newline',
+    'whitespace/forcolon',
+    'whitespace/indent',
+    'whitespace/line_length',
+    'whitespace/newline',
+    'whitespace/operators',
+    'whitespace/parens',
+    'whitespace/semicolon',
+    'whitespace/tab',
+    'whitespace/todo',
+    ]
+
+# These error categories are no longer enforced by cpplint, but for backwards-
+# compatibility they may still appear in NOLINT comments.
+_LEGACY_ERROR_CATEGORIES = [
+    'readability/streams',
+    ]
+
+# The default state of the category filter. This is overridden by the --filter=
+# flag. By default all errors are on, so only add here categories that should be
+# off by default (i.e., categories that must be enabled by the --filter= flags).
+# All entries here should start with a '-' or '+', as in the --filter= flag.
+_DEFAULT_FILTERS = ['-build/include_alpha']
+
+# We used to check for high-bit characters, but after much discussion we
+# decided those were OK, as long as they were in UTF-8 and didn't represent
+# hard-coded international strings, which belong in a separate i18n file.
+
+# C++ headers
+_CPP_HEADERS = frozenset([
+    # Legacy
+    'algobase.h',
+    'algo.h',
+    'alloc.h',
+    'builtinbuf.h',
+    'bvector.h',
+    'complex.h',
+    'defalloc.h',
+    'deque.h',
+    'editbuf.h',
+    'fstream.h',
+    'function.h',
+    'hash_map',
+    'hash_map.h',
+    'hash_set',
+    'hash_set.h',
+    'hashtable.h',
+    'heap.h',
+    'indstream.h',
+    'iomanip.h',
+    'iostream.h',
+    'istream.h',
+    'iterator.h',
+    'list.h',
+    'map.h',
+    'multimap.h',
+    'multiset.h',
+    'ostream.h',
+    'pair.h',
+    'parsestream.h',
+    'pfstream.h',
+    'procbuf.h',
+    'pthread_alloc',
+    'pthread_alloc.h',
+    'rope',
+    'rope.h',
+    'ropeimpl.h',
+    'set.h',
+    'slist',
+    'slist.h',
+    'stack.h',
+    'stdiostream.h',
+    'stl_alloc.h',
+    'stl_relops.h',
+    'streambuf.h',
+    'stream.h',
+    'strfile.h',
+    'strstream.h',
+    'tempbuf.h',
+    'tree.h',
+    'type_traits.h',
+    'vector.h',
+    # 17.6.1.2 C++ library headers
+    'algorithm',
+    'array',
+    'atomic',
+    'bitset',
+    'chrono',
+    'codecvt',
+    'complex',
+    'condition_variable',
+    'deque',
+    'exception',
+    'forward_list',
+    'fstream',
+    'functional',
+    'future',
+    'initializer_list',
+    'iomanip',
+    'ios',
+    'iosfwd',
+    'iostream',
+    'istream',
+    'iterator',
+    'limits',
+    'list',
+    'locale',
+    'map',
+    'memory',
+    'mutex',
+    'new',
+    'numeric',
+    'ostream',
+    'queue',
+    'random',
+    'ratio',
+    'regex',
+    'set',
+    'sstream',
+    'stack',
+    'stdexcept',
+    'streambuf',
+    'string',
+    'strstream',
+    'system_error',
+    'thread',
+    'tuple',
+    'typeindex',
+    'typeinfo',
+    'type_traits',
+    'unordered_map',
+    'unordered_set',
+    'utility',
+    'valarray',
+    'vector',
+    # 17.6.1.2 C++ headers for C library facilities
+    'cassert',
+    'ccomplex',
+    'cctype',
+    'cerrno',
+    'cfenv',
+    'cfloat',
+    'cinttypes',
+    'ciso646',
+    'climits',
+    'clocale',
+    'cmath',
+    'csetjmp',
+    'csignal',
+    'cstdalign',
+    'cstdarg',
+    'cstdbool',
+    'cstddef',
+    'cstdint',
+    'cstdio',
+    'cstdlib',
+    'cstring',
+    'ctgmath',
+    'ctime',
+    'cuchar',
+    'cwchar',
+    'cwctype',
+    ])
+
+
+# These headers are excluded from [build/include] and [build/include_order]
+# checks:
+# - Anything not following google file name conventions (containing an
+#   uppercase character, such as Python.h or nsStringAPI.h, for example).
+# - Lua headers.
+_THIRD_PARTY_HEADERS_PATTERN = re.compile(
+    r'^(?:[^/]*[A-Z][^/]*\.h|lua\.h|lauxlib\.h|lualib\.h)$')
+
+
+# Assertion macros.  These are defined in base/logging.h and
+# testing/base/gunit.h.  Note that the _M versions need to come first
+# for substring matching to work.
+_CHECK_MACROS = [
+    'DCHECK', 'CHECK',
+    'EXPECT_TRUE_M', 'EXPECT_TRUE',
+    'ASSERT_TRUE_M', 'ASSERT_TRUE',
+    'EXPECT_FALSE_M', 'EXPECT_FALSE',
+    'ASSERT_FALSE_M', 'ASSERT_FALSE',
+    ]
+
+# Replacement macros for CHECK/DCHECK/EXPECT_TRUE/EXPECT_FALSE
+_CHECK_REPLACEMENT = dict([(m, {}) for m in _CHECK_MACROS])
+
+for op, replacement in [('==', 'EQ'), ('!=', 'NE'),
+                        ('>=', 'GE'), ('>', 'GT'),
+                        ('<=', 'LE'), ('<', 'LT')]:
+  _CHECK_REPLACEMENT['DCHECK'][op] = 'DCHECK_%s' % replacement
+  _CHECK_REPLACEMENT['CHECK'][op] = 'CHECK_%s' % replacement
+  _CHECK_REPLACEMENT['EXPECT_TRUE'][op] = 'EXPECT_%s' % replacement
+  _CHECK_REPLACEMENT['ASSERT_TRUE'][op] = 'ASSERT_%s' % replacement
+  _CHECK_REPLACEMENT['EXPECT_TRUE_M'][op] = 'EXPECT_%s_M' % replacement
+  _CHECK_REPLACEMENT['ASSERT_TRUE_M'][op] = 'ASSERT_%s_M' % replacement
+
+for op, inv_replacement in [('==', 'NE'), ('!=', 'EQ'),
+                            ('>=', 'LT'), ('>', 'LE'),
+                            ('<=', 'GT'), ('<', 'GE')]:
+  _CHECK_REPLACEMENT['EXPECT_FALSE'][op] = 'EXPECT_%s' % inv_replacement
+  _CHECK_REPLACEMENT['ASSERT_FALSE'][op] = 'ASSERT_%s' % inv_replacement
+  _CHECK_REPLACEMENT['EXPECT_FALSE_M'][op] = 'EXPECT_%s_M' % inv_replacement
+  _CHECK_REPLACEMENT['ASSERT_FALSE_M'][op] = 'ASSERT_%s_M' % inv_replacement
+
+# Alternative tokens and their replacements.  For full list, see section 2.5
+# Alternative tokens [lex.digraph] in the C++ standard.
+#
+# Digraphs (such as '%:') are not included here since it's a mess to
+# match those on a word boundary.
+_ALT_TOKEN_REPLACEMENT = {
+    'and': '&&',
+    'bitor': '|',
+    'or': '||',
+    'xor': '^',
+    'compl': '~',
+    'bitand': '&',
+    'and_eq': '&=',
+    'or_eq': '|=',
+    'xor_eq': '^=',
+    'not': '!',
+    'not_eq': '!='
+    }
+
+# Compile regular expression that matches all the above keywords.  The "[ =()]"
+# bit is meant to avoid matching these keywords outside of boolean expressions.
+#
+# False positives include C-style multi-line comments and multi-line strings
+# but those have always been troublesome for cpplint.
+_ALT_TOKEN_REPLACEMENT_PATTERN = re.compile(
+    r'[ =()](' + ('|'.join(_ALT_TOKEN_REPLACEMENT.keys())) + r')(?=[ (]|$)')
+
+
+# These constants define types of headers for use with
+# _IncludeState.CheckNextIncludeOrder().
+_C_SYS_HEADER = 1
+_CPP_SYS_HEADER = 2
+_LIKELY_MY_HEADER = 3
+_POSSIBLE_MY_HEADER = 4
+_OTHER_HEADER = 5
+
+# These constants define the current inline assembly state
+_NO_ASM = 0       # Outside of inline assembly block
+_INSIDE_ASM = 1   # Inside inline assembly block
+_END_ASM = 2      # Last line of inline assembly block
+_BLOCK_ASM = 3    # The whole block is an inline assembly block
+
+# Match start of assembly blocks
+_MATCH_ASM = re.compile(r'^\s*(?:asm|_asm|__asm|__asm__)'
+                        r'(?:\s+(volatile|__volatile__))?'
+                        r'\s*[{(]')
+
+
+_regexp_compile_cache = {}
+
+# {str, set(int)}: a map from error categories to sets of linenumbers
+# on which those errors are expected and should be suppressed.
+_error_suppressions = {}
+
+# The root directory used for deriving header guard CPP variable.
+# This is set by --root flag.
+_root = None
+
+# The allowed line length of files.
+# This is set by --linelength flag.
+_line_length = 80
+
+# The allowed extensions for file names
+# This is set by --extensions flag.
+_valid_extensions = set(['cc', 'h', 'cpp', 'cu', 'cuh'])
+
+def ParseNolintSuppressions(filename, raw_line, linenum, error):
+  """Updates the global list of error-suppressions.
+
+  Parses any NOLINT comments on the current line, updating the global
+  error_suppressions store.  Reports an error if the NOLINT comment
+  was malformed.
+
+  Args:
+    filename: str, the name of the input file.
+    raw_line: str, the line of input text, with comments.
+    linenum: int, the number of the current line.
+    error: function, an error handler.
+  """
+  matched = Search(r'\bNOLINT(NEXTLINE)?\b(\([^)]+\))?', raw_line)
+  if matched:
+    if matched.group(1):
+      suppressed_line = linenum + 1
+    else:
+      suppressed_line = linenum
+    category = matched.group(2)
+    if category in (None, '(*)'):  # => "suppress all"
+      _error_suppressions.setdefault(None, set()).add(suppressed_line)
+    else:
+      if category.startswith('(') and category.endswith(')'):
+        category = category[1:-1]
+        if category in _ERROR_CATEGORIES:
+          _error_suppressions.setdefault(category, set()).add(suppressed_line)
+        elif category not in _LEGACY_ERROR_CATEGORIES:
+          error(filename, linenum, 'readability/nolint', 5,
+                'Unknown NOLINT error category: %s' % category)
+
+
+def ResetNolintSuppressions():
+  """Resets the set of NOLINT suppressions to empty."""
+  _error_suppressions.clear()
+
+
+def IsErrorSuppressedByNolint(category, linenum):
+  """Returns true if the specified error category is suppressed on this line.
+
+  Consults the global error_suppressions map populated by
+  ParseNolintSuppressions/ResetNolintSuppressions.
+
+  Args:
+    category: str, the category of the error.
+    linenum: int, the current line number.
+  Returns:
+    bool, True iff the error should be suppressed due to a NOLINT comment.
+  """
+  return (linenum in _error_suppressions.get(category, set()) or
+          linenum in _error_suppressions.get(None, set()))
+
+
+def Match(pattern, s):
+  """Matches the string with the pattern, caching the compiled regexp."""
+  # The regexp compilation caching is inlined in both Match and Search for
+  # performance reasons; factoring it out into a separate function turns out
+  # to be noticeably expensive.
+  if pattern not in _regexp_compile_cache:
+    _regexp_compile_cache[pattern] = sre_compile.compile(pattern)
+  return _regexp_compile_cache[pattern].match(s)
+
+
+def ReplaceAll(pattern, rep, s):
+  """Replaces instances of pattern in a string with a replacement.
+
+  The compiled regex is kept in a cache shared by Match and Search.
+
+  Args:
+    pattern: regex pattern
+    rep: replacement text
+    s: search string
+
+  Returns:
+    string with replacements made (or original string if no replacements)
+  """
+  if pattern not in _regexp_compile_cache:
+    _regexp_compile_cache[pattern] = sre_compile.compile(pattern)
+  return _regexp_compile_cache[pattern].sub(rep, s)
+
+
+def Search(pattern, s):
+  """Searches the string for the pattern, caching the compiled regexp."""
+  if pattern not in _regexp_compile_cache:
+    _regexp_compile_cache[pattern] = sre_compile.compile(pattern)
+  return _regexp_compile_cache[pattern].search(s)
+
+
+class _IncludeState(object):
+  """Tracks line numbers for includes, and the order in which includes appear.
+
+  include_list contains list of lists of (header, line number) pairs.
+  It's a list of lists rather than just one flat list to make it
+  easier to update across preprocessor boundaries.
+
+  Call CheckNextIncludeOrder() once for each header in the file, passing
+  in the type constants defined above. Calls in an illegal order will
+  raise an _IncludeError with an appropriate error message.
+
+  """
+  # self._section will move monotonically through this set. If it ever
+  # needs to move backwards, CheckNextIncludeOrder will raise an error.
+  _INITIAL_SECTION = 0
+  _MY_H_SECTION = 1
+  _C_SECTION = 2
+  _CPP_SECTION = 3
+  _OTHER_H_SECTION = 4
+
+  _TYPE_NAMES = {
+      _C_SYS_HEADER: 'C system header',
+      _CPP_SYS_HEADER: 'C++ system header',
+      _LIKELY_MY_HEADER: 'header this file implements',
+      _POSSIBLE_MY_HEADER: 'header this file may implement',
+      _OTHER_HEADER: 'other header',
+      }
+  _SECTION_NAMES = {
+      _INITIAL_SECTION: "... nothing. (This can't be an error.)",
+      _MY_H_SECTION: 'a header this file implements',
+      _C_SECTION: 'C system header',
+      _CPP_SECTION: 'C++ system header',
+      _OTHER_H_SECTION: 'other header',
+      }
+
+  def __init__(self):
+    self.include_list = [[]]
+    self.ResetSection('')
+
+  def FindHeader(self, header):
+    """Check if a header has already been included.
+
+    Args:
+      header: header to check.
+    Returns:
+      Line number of previous occurrence, or -1 if the header has not
+      been seen before.
+    """
+    for section_list in self.include_list:
+      for f in section_list:
+        if f[0] == header:
+          return f[1]
+    return -1
+
+  def ResetSection(self, directive):
+    """Reset section checking for preprocessor directive.
+
+    Args:
+      directive: preprocessor directive (e.g. "if", "else").
+    """
+    # The name of the current section.
+    self._section = self._INITIAL_SECTION
+    # The path of last found header.
+    self._last_header = ''
+
+    # Update list of includes.  Note that we never pop from the
+    # include list.
+    if directive in ('if', 'ifdef', 'ifndef'):
+      self.include_list.append([])
+    elif directive in ('else', 'elif'):
+      self.include_list[-1] = []
+
+  def SetLastHeader(self, header_path):
+    self._last_header = header_path
+
+  def CanonicalizeAlphabeticalOrder(self, header_path):
+    """Returns a path canonicalized for alphabetical comparison.
+
+    - replaces "-" with "_" so they both cmp the same.
+    - removes '-inl' since we don't require them to be after the main header.
+    - lowercase everything, just in case.
+
+    Args:
+      header_path: Path to be canonicalized.
+
+    Returns:
+      Canonicalized path.
+    """
+    return header_path.replace('-inl.h', '.h').replace('-', '_').lower()
+
+  def IsInAlphabeticalOrder(self, clean_lines, linenum, header_path):
+    """Check if a header is in alphabetical order with the previous header.
+
+    Args:
+      clean_lines: A CleansedLines instance containing the file.
+      linenum: The number of the line to check.
+      header_path: Canonicalized header to be checked.
+
+    Returns:
+      Returns true if the header is in alphabetical order.
+    """
+    # If previous section is different from current section, _last_header will
+    # be reset to empty string, so it's always less than current header.
+    #
+    # If previous line was a blank line, assume that the headers are
+    # intentionally sorted the way they are.
+    if (self._last_header > header_path and
+        Match(r'^\s*#\s*include\b', clean_lines.elided[linenum - 1])):
+      return False
+    return True
+
+  def CheckNextIncludeOrder(self, header_type):
+    """Returns a non-empty error message if the next header is out of order.
+
+    This function also updates the internal state to be ready to check
+    the next include.
+
+    Args:
+      header_type: One of the _XXX_HEADER constants defined above.
+
+    Returns:
+      The empty string if the header is in the right order, or an
+      error message describing what's wrong.
+
+    """
+    error_message = ('Found %s after %s' %
+                     (self._TYPE_NAMES[header_type],
+                      self._SECTION_NAMES[self._section]))
+
+    last_section = self._section
+
+    if header_type == _C_SYS_HEADER:
+      if self._section <= self._C_SECTION:
+        self._section = self._C_SECTION
+      else:
+        self._last_header = ''
+        return error_message
+    elif header_type == _CPP_SYS_HEADER:
+      if self._section <= self._CPP_SECTION:
+        self._section = self._CPP_SECTION
+      else:
+        self._last_header = ''
+        return error_message
+    elif header_type == _LIKELY_MY_HEADER:
+      if self._section <= self._MY_H_SECTION:
+        self._section = self._MY_H_SECTION
+      else:
+        self._section = self._OTHER_H_SECTION
+    elif header_type == _POSSIBLE_MY_HEADER:
+      if self._section <= self._MY_H_SECTION:
+        self._section = self._MY_H_SECTION
+      else:
+        # This will always be the fallback because we're not sure
+        # enough that the header is associated with this file.
+        self._section = self._OTHER_H_SECTION
+    else:
+      assert header_type == _OTHER_HEADER
+      self._section = self._OTHER_H_SECTION
+
+    if last_section != self._section:
+      self._last_header = ''
+
+    return ''
+
+
+class _CppLintState(object):
+  """Maintains module-wide state."""
+
+  def __init__(self):
+    self.verbose_level = 1  # global setting.
+    self.error_count = 0    # global count of reported errors
+    # filters to apply when emitting error messages
+    self.filters = _DEFAULT_FILTERS[:]
+    # backup of filter list. Used to restore the state after each file.
+    self._filters_backup = self.filters[:]
+    self.counting = 'total'  # In what way are we counting errors?
+    self.errors_by_category = {}  # string to int dict storing error counts
+
+    # output format:
+    # "emacs" - format that emacs can parse (default)
+    # "vs7" - format that Microsoft Visual Studio 7 can parse
+    self.output_format = 'emacs'
+
+  def SetOutputFormat(self, output_format):
+    """Sets the output format for errors."""
+    self.output_format = output_format
+
+  def SetVerboseLevel(self, level):
+    """Sets the module's verbosity, and returns the previous setting."""
+    last_verbose_level = self.verbose_level
+    self.verbose_level = level
+    return last_verbose_level
+
+  def SetCountingStyle(self, counting_style):
+    """Sets the module's counting options."""
+    self.counting = counting_style
+
+  def SetFilters(self, filters):
+    """Sets the error-message filters.
+
+    These filters are applied when deciding whether to emit a given
+    error message.
+
+    Args:
+      filters: A string of comma-separated filters (eg "+whitespace/indent").
+               Each filter should start with + or -; else we die.
+
+    Raises:
+      ValueError: The comma-separated filters did not all start with '+' or '-'.
+                  E.g. "-,+whitespace,-whitespace/indent,whitespace/badfilter"
+    """
+    # Default filters always have less priority than the flag ones.
+    self.filters = _DEFAULT_FILTERS[:]
+    self.AddFilters(filters)
+
+  def AddFilters(self, filters):
+    """ Adds more filters to the existing list of error-message filters. """
+    for filt in filters.split(','):
+      clean_filt = filt.strip()
+      if clean_filt:
+        self.filters.append(clean_filt)
+    for filt in self.filters:
+      if not (filt.startswith('+') or filt.startswith('-')):
+        raise ValueError('Every filter in --filters must start with + or -'
+                         ' (%s does not)' % filt)
+
+  def BackupFilters(self):
+    """ Saves the current filter list to backup storage."""
+    self._filters_backup = self.filters[:]
+
+  def RestoreFilters(self):
+    """ Restores filters previously backed up."""
+    self.filters = self._filters_backup[:]
+
+  def ResetErrorCounts(self):
+    """Sets the module's error statistic back to zero."""
+    self.error_count = 0
+    self.errors_by_category = {}
+
+  def IncrementErrorCount(self, category):
+    """Bumps the module's error statistic."""
+    self.error_count += 1
+    if self.counting in ('toplevel', 'detailed'):
+      if self.counting != 'detailed':
+        category = category.split('/')[0]
+      if category not in self.errors_by_category:
+        self.errors_by_category[category] = 0
+      self.errors_by_category[category] += 1
+
+  def PrintErrorCounts(self):
+    """Print a summary of errors by category, and the total."""
+    for category, count in self.errors_by_category.iteritems():
+      sys.stderr.write('Category \'%s\' errors found: %d\n' %
+                       (category, count))
+    sys.stderr.write('Total errors found: %d\n' % self.error_count)
+
+_cpplint_state = _CppLintState()
+
+
+def _OutputFormat():
+  """Gets the module's output format."""
+  return _cpplint_state.output_format
+
+
+def _SetOutputFormat(output_format):
+  """Sets the module's output format."""
+  _cpplint_state.SetOutputFormat(output_format)
+
+
+def _VerboseLevel():
+  """Returns the module's verbosity setting."""
+  return _cpplint_state.verbose_level
+
+
+def _SetVerboseLevel(level):
+  """Sets the module's verbosity, and returns the previous setting."""
+  return _cpplint_state.SetVerboseLevel(level)
+
+
+def _SetCountingStyle(level):
+  """Sets the module's counting options."""
+  _cpplint_state.SetCountingStyle(level)
+
+
+def _Filters():
+  """Returns the module's list of output filters, as a list."""
+  return _cpplint_state.filters
+
+
+def _SetFilters(filters):
+  """Sets the module's error-message filters.
+
+  These filters are applied when deciding whether to emit a given
+  error message.
+
+  Args:
+    filters: A string of comma-separated filters (eg "+whitespace/indent").
+             Each filter should start with + or -; else we die.
+  """
+  _cpplint_state.SetFilters(filters)
+
+def _AddFilters(filters):
+  """Adds more filter overrides.
+
+  Unlike _SetFilters, this function does not reset the current list of filters
+  available.
+
+  Args:
+    filters: A string of comma-separated filters (eg "+whitespace/indent").
+             Each filter should start with + or -; else we die.
+  """
+  _cpplint_state.AddFilters(filters)
+
+def _BackupFilters():
+  """ Saves the current filter list to backup storage."""
+  _cpplint_state.BackupFilters()
+
+def _RestoreFilters():
+  """ Restores filters previously backed up."""
+  _cpplint_state.RestoreFilters()
+
+class _FunctionState(object):
+  """Tracks current function name and the number of lines in its body."""
+
+  _NORMAL_TRIGGER = 250  # for --v=0, 500 for --v=1, etc.
+  _TEST_TRIGGER = 400    # about 60% more than _NORMAL_TRIGGER.
+
+  def __init__(self):
+    self.in_a_function = False
+    self.lines_in_function = 0
+    self.current_function = ''
+
+  def Begin(self, function_name):
+    """Start analyzing function body.
+
+    Args:
+      function_name: The name of the function being tracked.
+    """
+    self.in_a_function = True
+    self.lines_in_function = 0
+    self.current_function = function_name
+
+  def Count(self):
+    """Count line in current function body."""
+    if self.in_a_function:
+      self.lines_in_function += 1
+
+  def Check(self, error, filename, linenum):
+    """Report if too many lines in function body.
+
+    Args:
+      error: The function to call with any errors found.
+      filename: The name of the current file.
+      linenum: The number of the line to check.
+    """
+    if Match(r'T(EST|est)', self.current_function):
+      base_trigger = self._TEST_TRIGGER
+    else:
+      base_trigger = self._NORMAL_TRIGGER
+    trigger = base_trigger * 2**_VerboseLevel()
+
+    if self.lines_in_function > trigger:
+      error_level = int(math.log(self.lines_in_function / base_trigger, 2))
+      # For base_trigger 250: 250 => 0, 500 => 1, 1000 => 2, 2000 => 3, ...
+      if error_level > 5:
+        error_level = 5
+      error(filename, linenum, 'readability/fn_size', error_level,
+            'Small and focused functions are preferred:'
+            ' %s has %d non-comment lines'
+            ' (error triggered by exceeding %d lines).'  % (
+                self.current_function, self.lines_in_function, trigger))
+
+  def End(self):
+    """Stop analyzing function body."""
+    self.in_a_function = False
+
+
+class _IncludeError(Exception):
+  """Indicates a problem with the include order in a file."""
+  pass
+
+
+class FileInfo(object):
+  """Provides utility functions for filenames.
+
+  FileInfo provides easy access to the components of a file's path
+  relative to the project root.
+  """
+
+  def __init__(self, filename):
+    self._filename = filename
+
+  def FullName(self):
+    """Make Windows paths like Unix."""
+    return os.path.abspath(self._filename).replace('\\', '/')
+
+  def RepositoryName(self):
+    """FullName after removing the local path to the repository.
+
+    If we have a real absolute path name here we can try to do something smart:
+    detecting the root of the checkout and truncating /path/to/checkout from
+    the name so that we get header guards that don't include things like
+    "C:\Documents and Settings\..." or "/home/username/..." in them and thus
+    people on different computers who have checked the source out to different
+    locations won't see bogus errors.
+    """
+    fullname = self.FullName()
+
+    if os.path.exists(fullname):
+      project_dir = os.path.dirname(fullname)
+
+      if os.path.exists(os.path.join(project_dir, ".svn")):
+        # If there's a .svn file in the current directory, we recursively look
+        # up the directory tree for the top of the SVN checkout
+        root_dir = project_dir
+        one_up_dir = os.path.dirname(root_dir)
+        while os.path.exists(os.path.join(one_up_dir, ".svn")):
+          root_dir = os.path.dirname(root_dir)
+          one_up_dir = os.path.dirname(one_up_dir)
+
+        prefix = os.path.commonprefix([root_dir, project_dir])
+        return fullname[len(prefix) + 1:]
+
+      # Not SVN <= 1.6? Try to find a git, hg, or svn top level directory by
+      # searching up from the current path.
+      root_dir = os.path.dirname(fullname)
+      while (root_dir != os.path.dirname(root_dir) and
+             not os.path.exists(os.path.join(root_dir, ".git")) and
+             not os.path.exists(os.path.join(root_dir, ".hg")) and
+             not os.path.exists(os.path.join(root_dir, ".svn"))):
+        root_dir = os.path.dirname(root_dir)
+
+      if (os.path.exists(os.path.join(root_dir, ".git")) or
+          os.path.exists(os.path.join(root_dir, ".hg")) or
+          os.path.exists(os.path.join(root_dir, ".svn"))):
+        prefix = os.path.commonprefix([root_dir, project_dir])
+        return fullname[len(prefix) + 1:]
+
+    # Don't know what to do; header guard warnings may be wrong...
+    return fullname
+
+  def Split(self):
+    """Splits the file into the directory, basename, and extension.
+
+    For 'chrome/browser/browser.cc', Split() would
+    return ('chrome/browser', 'browser', '.cc')
+
+    Returns:
+      A tuple of (directory, basename, extension).
+    """
+
+    googlename = self.RepositoryName()
+    project, rest = os.path.split(googlename)
+    return (project,) + os.path.splitext(rest)
+
+  def BaseName(self):
+    """File base name - text after the final slash, before the final period."""
+    return self.Split()[1]
+
+  def Extension(self):
+    """File extension - text from the final period onward (period included)."""
+    return self.Split()[2]
+
+  def NoExtension(self):
+    """File path (directory and base name) without the extension."""
+    return '/'.join(self.Split()[0:2])
+
+  def IsSource(self):
+    """File has a source file extension."""
+    return self.Extension()[1:] in ('c', 'cc', 'cpp', 'cxx')
+
+
+def _ShouldPrintError(category, confidence, linenum):
+  """If confidence >= verbose, category passes filter and is not suppressed."""
+
+  # There are three ways we might decide not to print an error message:
+  # a "NOLINT(category)" comment appears in the source,
+  # the verbosity level isn't high enough, or the filters filter it out.
+  if IsErrorSuppressedByNolint(category, linenum):
+    return False
+
+  if confidence < _cpplint_state.verbose_level:
+    return False
+
+  is_filtered = False
+  for one_filter in _Filters():
+    if one_filter.startswith('-'):
+      if category.startswith(one_filter[1:]):
+        is_filtered = True
+    elif one_filter.startswith('+'):
+      if category.startswith(one_filter[1:]):
+        is_filtered = False
+    else:
+      assert False  # should have been checked for in AddFilters.
+  if is_filtered:
+    return False
+
+  return True
+
+
+def Error(filename, linenum, category, confidence, message):
+  """Logs the fact we've found a lint error.
+
+  We log where the error was found, and also our confidence in the error,
+  that is, how certain we are this is a legitimate style regression, and
+  not a misidentification or a use that's sometimes justified.
+
+  False positives can be suppressed by the use of
+  "NOLINT(category)" comments on the offending line.  These are
+  parsed into _error_suppressions.
+
+  Args:
+    filename: The name of the file containing the error.
+    linenum: The number of the line containing the error.
+    category: A string used to describe the "category" this bug
+      falls under: "whitespace", say, or "runtime".  Categories
+      may have a hierarchy separated by slashes: "whitespace/indent".
+    confidence: A number from 1-5 representing a confidence score for
+      the error, with 5 meaning that we are certain of the problem,
+      and 1 meaning that it could be a legitimate construct.
+    message: The error message.
+  """
+  if _ShouldPrintError(category, confidence, linenum):
+    _cpplint_state.IncrementErrorCount(category)
+    if _cpplint_state.output_format == 'vs7':
+      sys.stderr.write('%s(%s):  %s  [%s] [%d]\n' % (
+          filename, linenum, message, category, confidence))
+    elif _cpplint_state.output_format == 'eclipse':
+      sys.stderr.write('%s:%s: warning: %s  [%s] [%d]\n' % (
+          filename, linenum, message, category, confidence))
+    else:
+      sys.stderr.write('%s:%s:  %s  [%s] [%d]\n' % (
+          filename, linenum, message, category, confidence))
+
+
+# Matches standard C++ escape sequences per 2.13.2.3 of the C++ standard.
+_RE_PATTERN_CLEANSE_LINE_ESCAPES = re.compile(
+    r'\\([abfnrtv?"\\\']|\d+|x[0-9a-fA-F]+)')
+# Match a single C style comment on the same line.
+_RE_PATTERN_C_COMMENTS = r'/\*(?:[^*]|\*(?!/))*\*/'
+# Matches multi-line C style comments.
+# This RE is a little bit more complicated than one might expect, because we
+# have to take care of space removal so that we can handle comments inside
+# statements better.
+# The current rule is: We only clear spaces from both sides when we're at the
+# end of the line. Otherwise, we try to remove spaces from the right side;
+# if this doesn't work, we try the left side, but only if there's a non-word
+# character on the right.
+_RE_PATTERN_CLEANSE_LINE_C_COMMENTS = re.compile(
+    r'(\s*' + _RE_PATTERN_C_COMMENTS + r'\s*$|' +
+    _RE_PATTERN_C_COMMENTS + r'\s+|' +
+    r'\s+' + _RE_PATTERN_C_COMMENTS + r'(?=\W)|' +
+    _RE_PATTERN_C_COMMENTS + r')')
+
+
+def IsCppString(line):
+  """Does line terminate so that the next symbol is in a string constant?
+
+  This function does not consider single-line nor multi-line comments.
+
+  Args:
+    line: a partial line of code, i.e. characters 0..n of a line.
+
+  Returns:
+    True, if next character appended to 'line' is inside a
+    string constant.
+  """
+
+  line = line.replace(r'\\', 'XX')  # after this, \\" does not match to \"
+  return ((line.count('"') - line.count(r'\"') - line.count("'\"'")) & 1) == 1
+
+
+def CleanseRawStrings(raw_lines):
+  """Removes C++11 raw strings from lines.
+
+    Before:
+      static const char kData[] = R"(
+          multi-line string
+          )";
+
+    After:
+      static const char kData[] = ""
+          (replaced by blank line)
+          "";
+
+  Args:
+    raw_lines: list of raw lines.
+
+  Returns:
+    list of lines with C++11 raw strings replaced by empty strings.
+  """
+
+  delimiter = None
+  lines_without_raw_strings = []
+  for line in raw_lines:
+    if delimiter:
+      # Inside a raw string, look for the end
+      end = line.find(delimiter)
+      if end >= 0:
+        # Found the end of the string, match leading space for this
+        # line and resume copying the original lines, and also insert
+        # a "" on the last line.
+        leading_space = Match(r'^(\s*)\S', line)
+        line = leading_space.group(1) + '""' + line[end + len(delimiter):]
+        delimiter = None
+      else:
+        # Haven't found the end yet, replace the line with an empty string.
+        line = '""'
+
+    # Look for beginning of a raw string, and replace them with
+    # empty strings.  This is done in a loop to handle multiple raw
+    # strings on the same line.
+    while delimiter is None:
+      # Look for beginning of a raw string.
+      # See 2.14.5 [lex.string] for syntax.
+      matched = Match(r'^(.*)\b(?:R|u8R|uR|UR|LR)"([^\s\\()]*)\((.*)$', line)
+      if matched:
+        delimiter = ')' + matched.group(2) + '"'
+
+        end = matched.group(3).find(delimiter)
+        if end >= 0:
+          # Raw string ended on same line
+          line = (matched.group(1) + '""' +
+                  matched.group(3)[end + len(delimiter):])
+          delimiter = None
+        else:
+          # Start of a multi-line raw string
+          line = matched.group(1) + '""'
+      else:
+        break
+
+    lines_without_raw_strings.append(line)
+
+  # TODO(unknown): if delimiter is not None here, we might want to
+  # emit a warning for unterminated string.
+  return lines_without_raw_strings
+
+
+def FindNextMultiLineCommentStart(lines, lineix):
+  """Find the beginning marker for a multiline comment."""
+  while lineix < len(lines):
+    if lines[lineix].strip().startswith('/*'):
+      # Only return this marker if the comment goes beyond this line
+      if lines[lineix].strip().find('*/', 2) < 0:
+        return lineix
+    lineix += 1
+  return len(lines)
+
+
+def FindNextMultiLineCommentEnd(lines, lineix):
+  """We are inside a comment, find the end marker."""
+  while lineix < len(lines):
+    if lines[lineix].strip().endswith('*/'):
+      return lineix
+    lineix += 1
+  return len(lines)
+
+
+def RemoveMultiLineCommentsFromRange(lines, begin, end):
+  """Clears a range of lines for multi-line comments."""
+  # Having /**/ dummy comments makes the lines non-empty, so we will not get
+  # unnecessary blank line warnings later in the code.
+  for i in range(begin, end):
+    lines[i] = '/**/'
+
+
+def RemoveMultiLineComments(filename, lines, error):
+  """Removes multiline (c-style) comments from lines."""
+  lineix = 0
+  while lineix < len(lines):
+    lineix_begin = FindNextMultiLineCommentStart(lines, lineix)
+    if lineix_begin >= len(lines):
+      return
+    lineix_end = FindNextMultiLineCommentEnd(lines, lineix_begin)
+    if lineix_end >= len(lines):
+      error(filename, lineix_begin + 1, 'readability/multiline_comment', 5,
+            'Could not find end of multi-line comment')
+      return
+    RemoveMultiLineCommentsFromRange(lines, lineix_begin, lineix_end + 1)
+    lineix = lineix_end + 1
+
+
+def CleanseComments(line):
+  """Removes //-comments and single-line C-style /* */ comments.
+
+  Args:
+    line: A line of C++ source.
+
+  Returns:
+    The line with single-line comments removed.
+  """
+  commentpos = line.find('//')
+  if commentpos != -1 and not IsCppString(line[:commentpos]):
+    line = line[:commentpos].rstrip()
+  # get rid of /* ... */
+  return _RE_PATTERN_CLEANSE_LINE_C_COMMENTS.sub('', line)
+
+
+class CleansedLines(object):
+  """Holds 4 copies of all lines with different preprocessing applied to them.
+
+  1) elided member contains lines without strings and comments.
+  2) lines member contains lines without comments.
+  3) raw_lines member contains all the lines without processing.
+  4) lines_without_raw_strings member is same as raw_lines, but with C++11 raw
+     strings removed.
+  All these members are of <type 'list'>, and of the same length.
+  """
+
+  def __init__(self, lines):
+    self.elided = []
+    self.lines = []
+    self.raw_lines = lines
+    self.num_lines = len(lines)
+    self.lines_without_raw_strings = CleanseRawStrings(lines)
+    for linenum in range(len(self.lines_without_raw_strings)):
+      self.lines.append(CleanseComments(
+          self.lines_without_raw_strings[linenum]))
+      elided = self._CollapseStrings(self.lines_without_raw_strings[linenum])
+      self.elided.append(CleanseComments(elided))
+
+  def NumLines(self):
+    """Returns the number of lines represented."""
+    return self.num_lines
+
+  @staticmethod
+  def _CollapseStrings(elided):
+    """Collapses strings and chars on a line to simple "" or '' blocks.
+
+    We nix strings first so we're not fooled by text like '"http://"'
+
+    Args:
+      elided: The line being processed.
+
+    Returns:
+      The line with collapsed strings.
+    """
+    if _RE_PATTERN_INCLUDE.match(elided):
+      return elided
+
+    # Remove escaped characters first to make quote/single quote collapsing
+    # basic.  Things that look like escaped characters shouldn't occur
+    # outside of strings and chars.
+    elided = _RE_PATTERN_CLEANSE_LINE_ESCAPES.sub('', elided)
+
+    # Replace quoted strings and digit separators.  Both single quotes
+    # and double quotes are processed in the same loop, otherwise
+    # nested quotes wouldn't work.
+    collapsed = ''
+    while True:
+      # Find the first quote character
+      match = Match(r'^([^\'"]*)([\'"])(.*)$', elided)
+      if not match:
+        collapsed += elided
+        break
+      head, quote, tail = match.groups()
+
+      if quote == '"':
+        # Collapse double quoted strings
+        second_quote = tail.find('"')
+        if second_quote >= 0:
+          collapsed += head + '""'
+          elided = tail[second_quote + 1:]
+        else:
+          # Unmatched double quote, don't bother processing the rest
+          # of the line since this is probably a multiline string.
+          collapsed += elided
+          break
+      else:
+        # Found single quote, check nearby text to eliminate digit separators.
+        #
+        # There is no special handling for floating point here, because
+        # the integer/fractional/exponent parts would all be parsed
+        # correctly as long as there are digits on both sides of the
+        # separator.  So we are fine as long as we don't see something
+        # like "0.'3" (gcc 4.9.0 will not allow this literal).
+        if Search(r'\b(?:0[bBxX]?|[1-9])[0-9a-fA-F]*$', head):
+          match_literal = Match(r'^((?:\'?[0-9a-zA-Z_])*)(.*)$', "'" + tail)
+          collapsed += head + match_literal.group(1).replace("'", '')
+          elided = match_literal.group(2)
+        else:
+          second_quote = tail.find('\'')
+          if second_quote >= 0:
+            collapsed += head + "''"
+            elided = tail[second_quote + 1:]
+          else:
+            # Unmatched single quote
+            collapsed += elided
+            break
+
+    return collapsed
+
+
+def FindEndOfExpressionInLine(line, startpos, stack):
+  """Find the position just after the end of current parenthesized expression.
+
+  Args:
+    line: a CleansedLines line.
+    startpos: start searching at this position.
+    stack: nesting stack at startpos.
+
+  Returns:
+    On finding matching end: (index just after matching end, None)
+    On finding an unclosed expression: (-1, None)
+    Otherwise: (-1, new stack at end of this line)
+  """
+  for i in xrange(startpos, len(line)):
+    char = line[i]
+    if char in '([{':
+      # Found start of parenthesized expression, push to expression stack
+      stack.append(char)
+    elif char == '<':
+      # Found potential start of template argument list
+      if i > 0 and line[i - 1] == '<':
+        # Left shift operator
+        if stack and stack[-1] == '<':
+          stack.pop()
+          if not stack:
+            return (-1, None)
+      elif i > 0 and Search(r'\boperator\s*$', line[0:i]):
+        # operator<, don't add to stack
+        continue
+      else:
+        # Tentative start of template argument list
+        stack.append('<')
+    elif char in ')]}':
+      # Found end of parenthesized expression.
+      #
+      # If we are currently expecting a matching '>', the pending '<'
+      # must have been an operator.  Remove them from expression stack.
+      while stack and stack[-1] == '<':
+        stack.pop()
+      if not stack:
+        return (-1, None)
+      if ((stack[-1] == '(' and char == ')') or
+          (stack[-1] == '[' and char == ']') or
+          (stack[-1] == '{' and char == '}')):
+        stack.pop()
+        if not stack:
+          return (i + 1, None)
+      else:
+        # Mismatched parentheses
+        return (-1, None)
+    elif char == '>':
+      # Found potential end of template argument list.
+
+      # Ignore "->" and operator functions
+      if (i > 0 and
+          (line[i - 1] == '-' or Search(r'\boperator\s*$', line[0:i - 1]))):
+        continue
+
+      # Pop the stack if there is a matching '<'.  Otherwise, ignore
+      # this '>' since it must be an operator.
+      if stack:
+        if stack[-1] == '<':
+          stack.pop()
+          if not stack:
+            return (i + 1, None)
+    elif char == ';':
+      # Found something that looks like the end of a statement.  If we are
+      # currently expecting a '>', the matching '<' must have been an
+      # operator, since template argument lists should not contain statements.
+      while stack and stack[-1] == '<':
+        stack.pop()
+      if not stack:
+        return (-1, None)
+
+  # Did not find end of expression or unbalanced parentheses on this line
+  return (-1, stack)
+
+
+def CloseExpression(clean_lines, linenum, pos):
+  """If input points to ( or { or [ or <, finds the position that closes it.
+
+  If lines[linenum][pos] points to a '(' or '{' or '[' or '<', finds the
+  linenum/pos that correspond to the closing of the expression.
+
+  TODO(unknown): cpplint spends a fair bit of time matching parentheses.
+  Ideally we would want to index all opening and closing parentheses once
+  and have CloseExpression be just a simple lookup, but due to preprocessor
+  tricks, this is not so easy.
+
+  Args:
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    pos: A position on the line.
+
+  Returns:
+    A tuple (line, linenum, pos) pointer *past* the closing brace, or
+    (line, len(lines), -1) if we never find a close.  Note we ignore
+    strings and comments when matching; and the line we return is the
+    'cleansed' line at linenum.
+  """
+
+  line = clean_lines.elided[linenum]
+  if (line[pos] not in '({[<') or Match(r'<[<=]', line[pos:]):
+    return (line, clean_lines.NumLines(), -1)
+
+  # Check first line
+  (end_pos, stack) = FindEndOfExpressionInLine(line, pos, [])
+  if end_pos > -1:
+    return (line, linenum, end_pos)
+
+  # Continue scanning forward
+  while stack and linenum < clean_lines.NumLines() - 1:
+    linenum += 1
+    line = clean_lines.elided[linenum]
+    (end_pos, stack) = FindEndOfExpressionInLine(line, 0, stack)
+    if end_pos > -1:
+      return (line, linenum, end_pos)
+
+  # Did not find end of expression before end of file, give up
+  return (line, clean_lines.NumLines(), -1)
+
+
+def FindStartOfExpressionInLine(line, endpos, stack):
+  """Find position at the matching start of current expression.
+
+  This is almost the reverse of FindEndOfExpressionInLine, but note
+  that the input position and returned position differ by 1.
+
+  Args:
+    line: a CleansedLines line.
+    endpos: start searching at this position.
+    stack: nesting stack at endpos.
+
+  Returns:
+    On finding matching start: (index at matching start, None)
+    On finding an unclosed expression: (-1, None)
+    Otherwise: (-1, new stack at beginning of this line)
+  """
+  i = endpos
+  while i >= 0:
+    char = line[i]
+    if char in ')]}':
+      # Found end of expression, push to expression stack
+      stack.append(char)
+    elif char == '>':
+      # Found potential end of template argument list.
+      #
+      # Ignore it if it's a "->" or ">=" or "operator>"
+      if (i > 0 and
+          (line[i - 1] == '-' or
+           Match(r'\s>=\s', line[i - 1:]) or
+           Search(r'\boperator\s*$', line[0:i]))):
+        i -= 1
+      else:
+        stack.append('>')
+    elif char == '<':
+      # Found potential start of template argument list
+      if i > 0 and line[i - 1] == '<':
+        # Left shift operator
+        i -= 1
+      else:
+        # If there is a matching '>', we can pop the expression stack.
+        # Otherwise, ignore this '<' since it must be an operator.
+        if stack and stack[-1] == '>':
+          stack.pop()
+          if not stack:
+            return (i, None)
+    elif char in '([{':
+      # Found start of expression.
+      #
+      # If there are any unmatched '>' on the stack, they must be
+      # operators.  Remove those.
+      while stack and stack[-1] == '>':
+        stack.pop()
+      if not stack:
+        return (-1, None)
+      if ((char == '(' and stack[-1] == ')') or
+          (char == '[' and stack[-1] == ']') or
+          (char == '{' and stack[-1] == '}')):
+        stack.pop()
+        if not stack:
+          return (i, None)
+      else:
+        # Mismatched parentheses
+        return (-1, None)
+    elif char == ';':
+      # Found something that looks like the end of a statement.  If we are
+      # currently expecting a '<', the matching '>' must have been an
+      # operator, since template argument lists should not contain statements.
+      while stack and stack[-1] == '>':
+        stack.pop()
+      if not stack:
+        return (-1, None)
+
+    i -= 1
+
+  return (-1, stack)
+
+
+def ReverseCloseExpression(clean_lines, linenum, pos):
+  """If input points to ) or } or ] or >, finds the position that opens it.
+
+  If lines[linenum][pos] points to a ')' or '}' or ']' or '>', finds the
+  linenum/pos that correspond to the opening of the expression.
+
+  Args:
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    pos: A position on the line.
+
+  Returns:
+    A tuple (line, linenum, pos) pointer *at* the opening brace, or
+    (line, 0, -1) if we never find the matching opening brace.  Note
+    we ignore strings and comments when matching; and the line we
+    return is the 'cleansed' line at linenum.
+  """
+  line = clean_lines.elided[linenum]
+  if line[pos] not in ')}]>':
+    return (line, 0, -1)
+
+  # Check last line
+  (start_pos, stack) = FindStartOfExpressionInLine(line, pos, [])
+  if start_pos > -1:
+    return (line, linenum, start_pos)
+
+  # Continue scanning backward
+  while stack and linenum > 0:
+    linenum -= 1
+    line = clean_lines.elided[linenum]
+    (start_pos, stack) = FindStartOfExpressionInLine(line, len(line) - 1, stack)
+    if start_pos > -1:
+      return (line, linenum, start_pos)
+
+  # Did not find start of expression before beginning of file, give up
+  return (line, 0, -1)
+
+
+def CheckForCopyright(filename, lines, error):
+  """Logs an error if no Copyright message appears at the top of the file."""
+
+  # We'll say it should occur by line 10. Don't forget there's a
+  # dummy line at the front.
+  for line in xrange(1, min(len(lines), 11)):
+    if re.search(r'Copyright', lines[line], re.I): break
+  else:                       # means no copyright line was found
+    error(filename, 0, 'legal/copyright', 5,
+          'No copyright message found.  '
+          'You should have a line: "Copyright [year] <Copyright Owner>"')
+
+
+def GetIndentLevel(line):
+  """Return the number of leading spaces in line.
+
+  Args:
+    line: A string to check.
+
+  Returns:
+    An integer count of leading spaces, possibly zero.
+  """
+  indent = Match(r'^( *)\S', line)
+  if indent:
+    return len(indent.group(1))
+  else:
+    return 0
+
+
+def GetHeaderGuardCPPVariable(filename):
+  """Returns the CPP variable that should be used as a header guard.
+
+  Args:
+    filename: The name of a C++ header file.
+
+  Returns:
+    The CPP variable that should be used as a header guard in the
+    named file.
+
+  """
+
+  # Restores original filename in case that cpplint is invoked from Emacs's
+  # flymake.
+  filename = re.sub(r'_flymake\.h$', '.h', filename)
+  filename = re.sub(r'/\.flymake/([^/]*)$', r'/\1', filename)
+  # Replace 'c++' with 'cpp'.
+  filename = filename.replace('C++', 'cpp').replace('c++', 'cpp')
+
+  fileinfo = FileInfo(filename)
+  file_path_from_root = fileinfo.RepositoryName()
+  if _root:
+    file_path_from_root = re.sub('^' + _root + os.sep, '', file_path_from_root)
+  return re.sub(r'[^a-zA-Z0-9]', '_', file_path_from_root).upper() + '_'
+
+
+def CheckForHeaderGuard(filename, clean_lines, error):
+  """Checks that the file contains a header guard.
+
+  Logs an error if no #ifndef header guard is present.  For other
+  headers, checks that the full pathname is used.
+
+  Args:
+    filename: The name of the C++ header file.
+    clean_lines: A CleansedLines instance containing the file.
+    error: The function to call with any errors found.
+  """
+
+  # Don't check for header guards if there are error suppression
+  # comments somewhere in this file.
+  #
+  # Because this is silencing a warning for a nonexistent line, we
+  # only support the very specific NOLINT(build/header_guard) syntax,
+  # and not the general NOLINT or NOLINT(*) syntax.
+  raw_lines = clean_lines.lines_without_raw_strings
+  for i in raw_lines:
+    if Search(r'//\s*NOLINT\(build/header_guard\)', i):
+      return
+
+  cppvar = GetHeaderGuardCPPVariable(filename)
+
+  ifndef = ''
+  ifndef_linenum = 0
+  define = ''
+  endif = ''
+  endif_linenum = 0
+  for linenum, line in enumerate(raw_lines):
+    linesplit = line.split()
+    if len(linesplit) >= 2:
+      # find the first occurrence of #ifndef and #define, save arg
+      if not ifndef and linesplit[0] == '#ifndef':
+        # set ifndef to the header guard presented on the #ifndef line.
+        ifndef = linesplit[1]
+        ifndef_linenum = linenum
+      if not define and linesplit[0] == '#define':
+        define = linesplit[1]
+    # find the last occurrence of #endif, save entire line
+    if line.startswith('#endif'):
+      endif = line
+      endif_linenum = linenum
+
+  if not ifndef or not define or ifndef != define:
+    error(filename, 0, 'build/header_guard', 5,
+          'No #ifndef header guard found, suggested CPP variable is: %s' %
+          cppvar)
+    return
+
+  # The guard should be PATH_FILE_H_, but we also allow PATH_FILE_H__
+  # for backward compatibility.
+  if ifndef != cppvar:
+    error_level = 0
+    if ifndef != cppvar + '_':
+      error_level = 5
+
+    ParseNolintSuppressions(filename, raw_lines[ifndef_linenum], ifndef_linenum,
+                            error)
+    error(filename, ifndef_linenum, 'build/header_guard', error_level,
+          '#ifndef header guard has wrong style, please use: %s' % cppvar)
+
+  # Check for "//" comments on endif line.
+  ParseNolintSuppressions(filename, raw_lines[endif_linenum], endif_linenum,
+                          error)
+  match = Match(r'#endif\s*//\s*' + cppvar + r'(_)?\b', endif)
+  if match:
+    if match.group(1) == '_':
+      # Issue low severity warning for deprecated double trailing underscore
+      error(filename, endif_linenum, 'build/header_guard', 0,
+            '#endif line should be "#endif  // %s"' % cppvar)
+    return
+
+  # Didn't find the corresponding "//" comment.  If this file does not
+  # contain any "//" comments at all, it could be that the compiler
+  # only wants "/**/" comments, look for those instead.
+  no_single_line_comments = True
+  for i in xrange(1, len(raw_lines) - 1):
+    line = raw_lines[i]
+    if Match(r'^(?:(?:\'(?:\.|[^\'])*\')|(?:"(?:\.|[^"])*")|[^\'"])*//', line):
+      no_single_line_comments = False
+      break
+
+  if no_single_line_comments:
+    match = Match(r'#endif\s*/\*\s*' + cppvar + r'(_)?\s*\*/', endif)
+    if match:
+      if match.group(1) == '_':
+        # Low severity warning for double trailing underscore
+        error(filename, endif_linenum, 'build/header_guard', 0,
+              '#endif line should be "#endif  /* %s */"' % cppvar)
+      return
+
+  # Didn't find anything
+  error(filename, endif_linenum, 'build/header_guard', 5,
+        '#endif line should be "#endif  // %s"' % cppvar)
+
+
+def CheckHeaderFileIncluded(filename, include_state, error):
+  """Logs an error if a .cc file does not include its header."""
+
+  # Do not check test files
+  if filename.endswith('_test.cc') or filename.endswith('_unittest.cc'):
+    return
+
+  fileinfo = FileInfo(filename)
+  headerfile = filename[0:len(filename) - 2] + 'h'
+  if not os.path.exists(headerfile):
+    return
+  headername = FileInfo(headerfile).RepositoryName()
+  first_include = 0
+  for section_list in include_state.include_list:
+    for f in section_list:
+      if headername in f[0] or f[0] in headername:
+        return
+      if not first_include:
+        first_include = f[1]
+
+  error(filename, first_include, 'build/include', 5,
+        '%s should include its header file %s' % (fileinfo.RepositoryName(),
+                                                  headername))
+
+
+def CheckForBadCharacters(filename, lines, error):
+  """Logs an error for each line containing bad characters.
+
+  Two kinds of bad characters:
+
+  1. Unicode replacement characters: These indicate that either the file
+  contained invalid UTF-8 (likely) or Unicode replacement characters (which
+  it shouldn't).  Note that it's possible for this to throw off line
+  numbering if the invalid UTF-8 occurred adjacent to a newline.
+
+  2. NUL bytes.  These are problematic for some tools.
+
+  Args:
+    filename: The name of the current file.
+    lines: An array of strings, each representing a line of the file.
+    error: The function to call with any errors found.
+  """
+  for linenum, line in enumerate(lines):
+    if u'\ufffd' in line:
+      error(filename, linenum, 'readability/utf8', 5,
+            'Line contains invalid UTF-8 (or Unicode replacement character).')
+    if '\0' in line:
+      error(filename, linenum, 'readability/nul', 5, 'Line contains NUL byte.')
+
+
+def CheckForNewlineAtEOF(filename, lines, error):
+  """Logs an error if there is no newline char at the end of the file.
+
+  Args:
+    filename: The name of the current file.
+    lines: An array of strings, each representing a line of the file.
+    error: The function to call with any errors found.
+  """
+
+  # The lines array was created by adding two newlines to the
+  # original file (go figure), then splitting on \n.
+  # To verify that the file ends in \n, we just have to make sure the
+  # second-to-last element of lines exists and is empty.
+  if len(lines) < 3 or lines[-2]:
+    error(filename, len(lines) - 2, 'whitespace/ending_newline', 5,
+          'Could not find a newline character at the end of the file.')
+
+
+def CheckForMultilineCommentsAndStrings(filename, clean_lines, linenum, error):
+  """Logs an error if we see /* ... */ or "..." that extend past one line.
+
+  /* ... */ comments are legit inside macros, for one line.
+  Otherwise, we prefer // comments, so it's ok to warn about the
+  other.  Likewise, it's ok for strings to extend across multiple
+  lines, as long as a line continuation character (backslash)
+  terminates each line. Although not currently prohibited by the C++
+  style guide, it's ugly and unnecessary. We don't do well with either
+  in this lint program, so we warn about both.
+
+  Args:
+    filename: The name of the current file.
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    error: The function to call with any errors found.
+  """
+  line = clean_lines.elided[linenum]
+
+  # Remove all \\ (escaped backslashes) from the line. They are OK, and the
+  # second (escaped) slash may trigger later \" detection erroneously.
+  line = line.replace('\\\\', '')
+
+  if line.count('/*') > line.count('*/'):
+    error(filename, linenum, 'readability/multiline_comment', 5,
+          'Complex multi-line /*...*/-style comment found. '
+          'Lint may give bogus warnings.  '
+          'Consider replacing these with //-style comments, '
+          'with #if 0...#endif, '
+          'or with more clearly structured multi-line comments.')
+
+  if (line.count('"') - line.count('\\"')) % 2:
+    error(filename, linenum, 'readability/multiline_string', 5,
+          'Multi-line string ("...") found.  This lint script doesn\'t '
+          'do well with such strings, and may give bogus warnings.  '
+          'Use C++11 raw strings or concatenation instead.')
+
+
+# (non-threadsafe name, thread-safe alternative, validation pattern)
+#
+# The validation pattern is used to eliminate false positives such as:
+#  _rand();               // false positive due to substring match.
+#  ->rand();              // some member function rand().
+#  ACMRandom rand(seed);  // some variable named rand.
+#  ISAACRandom rand();    // another variable named rand.
+#
+# Basically we require the return value of these functions to be used
+# in some expression context on the same line by matching on some
+# operator before the function name.  This eliminates constructors and
+# member function calls.
+_UNSAFE_FUNC_PREFIX = r'(?:[-+*/=%^&|(<]\s*|>\s+)'
+_THREADING_LIST = (
+    ('asctime(', 'asctime_r(', _UNSAFE_FUNC_PREFIX + r'asctime\([^)]+\)'),
+    ('ctime(', 'ctime_r(', _UNSAFE_FUNC_PREFIX + r'ctime\([^)]+\)'),
+    ('getgrgid(', 'getgrgid_r(', _UNSAFE_FUNC_PREFIX + r'getgrgid\([^)]+\)'),
+    ('getgrnam(', 'getgrnam_r(', _UNSAFE_FUNC_PREFIX + r'getgrnam\([^)]+\)'),
+    ('getlogin(', 'getlogin_r(', _UNSAFE_FUNC_PREFIX + r'getlogin\(\)'),
+    ('getpwnam(', 'getpwnam_r(', _UNSAFE_FUNC_PREFIX + r'getpwnam\([^)]+\)'),
+    ('getpwuid(', 'getpwuid_r(', _UNSAFE_FUNC_PREFIX + r'getpwuid\([^)]+\)'),
+    ('gmtime(', 'gmtime_r(', _UNSAFE_FUNC_PREFIX + r'gmtime\([^)]+\)'),
+    ('localtime(', 'localtime_r(', _UNSAFE_FUNC_PREFIX + r'localtime\([^)]+\)'),
+    ('rand(', 'rand_r(', _UNSAFE_FUNC_PREFIX + r'rand\(\)'),
+    ('strtok(', 'strtok_r(',
+     _UNSAFE_FUNC_PREFIX + r'strtok\([^)]+\)'),
+    ('ttyname(', 'ttyname_r(', _UNSAFE_FUNC_PREFIX + r'ttyname\([^)]+\)'),
+    )
+
+
+def CheckPosixThreading(filename, clean_lines, linenum, error):
+  """Checks for calls to thread-unsafe functions.
+
+  Much code was originally written without consideration of
+  multi-threading. Engineers also tend to rely on prior experience:
+  many learned POSIX before the threading extensions were added. These
+  tests guide engineers toward thread-safe functions (when using
+  POSIX directly).
+
+  Args:
+    filename: The name of the current file.
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    error: The function to call with any errors found.
+  """
+  line = clean_lines.elided[linenum]
+  for single_thread_func, multithread_safe_func, pattern in _THREADING_LIST:
+    # Additional pattern matching check to confirm that this is the
+    # function we are looking for
+    if Search(pattern, line):
+      error(filename, linenum, 'runtime/threadsafe_fn', 2,
+            'Consider using ' + multithread_safe_func +
+            '...) instead of ' + single_thread_func +
+            '...) for improved thread safety.')
+
+
+def CheckVlogArguments(filename, clean_lines, linenum, error):
+  """Checks that VLOG() is only used for defining a logging level.
+
+  For example, VLOG(2) is correct. VLOG(INFO), VLOG(WARNING), VLOG(ERROR), and
+  VLOG(FATAL) are not.
+
+  Args:
+    filename: The name of the current file.
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    error: The function to call with any errors found.
+  """
+  line = clean_lines.elided[linenum]
+  if Search(r'\bVLOG\((INFO|ERROR|WARNING|DFATAL|FATAL)\)', line):
+    error(filename, linenum, 'runtime/vlog', 5,
+          'VLOG() should be used with numeric verbosity level.  '
+          'Use LOG() if you want symbolic severity levels.')
+
+# Matches invalid increment: *count++, which moves pointer instead of
+# incrementing a value.
+_RE_PATTERN_INVALID_INCREMENT = re.compile(
+    r'^\s*\*\w+(\+\+|--);')
+
+
+def CheckInvalidIncrement(filename, clean_lines, linenum, error):
+  """Checks for invalid increment *count++.
+
+  For example, the following function:
+  void increment_counter(int* count) {
+    *count++;
+  }
+  is invalid because it effectively does count++, moving the pointer, and
+  should be replaced with ++*count, (*count)++ or *count += 1.
+
+  Args:
+    filename: The name of the current file.
+    clean_lines: A CleansedLines instance containing the file.
+    linenum: The number of the line to check.
+    error: The function to call with any errors found.
+  """
+  line = clean_lines.elided[linenum]
+  if _RE_PATTERN_INVALID_INCREMENT.match(line):
+    error(filename, linenum, 'runtime/invalid_increment', 5,
+          'Changing pointer instead of value (or unused value of operator*).')
+
+
+def IsMacroDefinition(clean_lines, linenum):
+  if Search(r'^#define', clean_lines[linenum]):
+    return True
+
+  if linenum > 0 and Search(r'\\$', clean_lines[linenum - 1]):
+    return True
+
+  return False
+
+
+def IsForwardClassDeclaration(clean_lines, linenum):
+  return Match(r'^\s*(\btemplate\b)*.*class\s+\w+;\s*$', clean_lines[linenum])
+
+
+class _BlockInfo(object):
+  """Stores information about a generic block of code."""
+
+  def __init__(self, seen_open_brace):
+    self.seen_open_brace = seen_open_brace
+    self.open_parentheses = 0
+    self.inline_asm = _NO_ASM
+    self.check_namespace_indentation = False
+
+  def CheckBegin(self, filename, clean_lines, linenum, error):
+    """Run checks that apply to text up to the opening brace.
+
+    This is mostly for checking the text between the class identifier
+    and the "{", usually where the base class is specified.  For other
+    blocks, there isn't much to check, so we always pass.
+
+    Args:
+      filename: The name of the current file.
+      clean_lines: A CleansedLines instance containing the file.
+      linenum: The number of the line to check.
+      error: The function to call with any errors found.
+    """
+    pass
+
+  def CheckEnd(self, filename, clean_lines, linenum, error):
+    """Run checks that apply to text after the closing brace.
+
+    This is mostly used for checking end of namespace comments.
+
+    Args:
+      filename: The name of the current file.
+      clean_lines: A CleansedLines instance containing the file.
+      linenum: The number of the line to check.
+      error: The function to call with any errors found.
+    """
+    pass
+
+  def IsBlockInfo(self):
+    """Returns true if this block is a _BlockInfo.
+
+    This is convenient for verifying that an object is an instance of
+    a _BlockInfo, but not an instance of any of the derived classes.
+
+    Returns:
+      True for this class, False for derived classes.
+    """
+    return self.__class__ == _BlockInfo
+
+
+class _ExternCInfo(_BlockInfo):
+  """Stores information about an 'extern "C"' block."""
+
+  def __init__(self):
+    _BlockInfo.__init__(self, True)
+
+
+class _ClassInfo(_BlockInfo):
+  """Stores information about a class."""
+
+  def __init__(self, name, class_or_struct, clean_lines, linenum):
+    _BlockInfo.__init__(self, False)
+    self.name = name
+    self.starting_linenum = linenum
+    self.is_derived = False
+    self.check_namespace_indentation = True
+    if class_or_struct == 'struct':
+      self.access = 'public'
+      self.is_struct = True
+    else:
+      self.access = 'private'
+      self.is_struct = False
+
+    # Remember initial indentation level for this class.  Using raw_lines here
+    # instead of elided to account for leading comments.
+    self.class_indent = GetIndentLevel(clean_lines.raw_lines[linenum])
+
+    # Try to find the end of the class.  This will be confused by things like:
+    #   class A {
+    #   } *x = { ...
+    #
+    # But it's still good enough for CheckSectionSpacing.
+    self.last_line = 0
+    depth = 0
+    for i in range(linenum, clean_lines.NumLines()):
+      line = clean_lines.elided[i]
+      depth += line.count('{') - line.count('}')
+      if not depth:
+        self.last_line = i
+        break
+
+  def CheckBegin(self, filename, clean_lines, linenum, error):
+    # Look for a bare ':'
+    if Search('(^|[^:]):($|[^:])', clean_lines.elided[linenum]):
+      self.is_derived = True
+
+  def CheckEnd(self, filename, clean_lines, linenum, error):
+    # If there is a DISALLOW macro, it should appear near the end of
+    # the class.
+    seen_last_thing_in_class = False
+    for i in xrange(linenum - 1, self.starting_linenum, -1):
+      match = Search(
+          r'\b(DISALLOW_COPY_AND_ASSIGN|DISALLOW_IMPLICIT_CONSTRUCTORS)\(' +
+          self.name + r'\)',
+          clean_lines.elided[i])
+      if match:
+        if seen_last_thing_in_class:
+          error(filename, i, 'readability/constructors', 3,
+                match.group(1) + ' should be the last thing in the class')
+        break
+
+      if not Match(r'^\s*$', clean_lines.elided[i]):
+        seen_last_thing_in_class = True
+
+    # Check that closing brace is aligned with beginning of the class.
+    # Only do this if the closing brace is preceded only by whitespace.
+    # This means we will not check single-line class definitions.
+    indent = Match(r'^( *)\}', clean_lines.elided[linenum])
+    if indent and len(indent.group(1)) != self.class_indent:
+      if self.is_struct:
+        parent = 'struct ' + self.name
+      else:
+        parent = 'class ' + self.name
+      error(filename, linenum, 'whitespace/indent', 3,
+            'Closing brace should be aligned with beginning of %s' % parent)
+
+
+class _NamespaceInfo(_BlockInfo):
+  """Stores information about a namespace."""
+
+  def __init__(self, name, linenum):
+    _BlockInfo.__init__(self, False)
+    self.name = name or ''
+    self.starting_linenum = linenum
+    self.check_namespace_indentation = True
+
+  def CheckEnd(self, filename, clean_lines, linenum, error):
+    """Check end of namespace comments."""
+    line = clean_lines.raw_lines[linenum]
+
+    # Check how many lines are enclosed in this namespace.  Don't issue
+    # warning for missing namespace comments if there aren't enough
+    # lines.  However, do apply checks if there is already an end of
+    # namespace comment and it's incorrect.
+    #
+    # TODO(unknown): We always want to check end of namespace comments
+    # if a namespace is large, but sometimes we also want to apply the
+    # check if a short namespace contained nontrivial things (something
+    # other than forward declarations).  There is currently no logic on
+    # deciding what these nontrivial things are, so this check is
+    # triggered by namespace size only, which works most of the time.
+    if (linenum - self.starting_linenum < 10
+        and not Match(r'};*\s*(//|/\*).*\bnamespace\b', line)):
+      return
+
+    # Look for matching comment at end of namespace.
+    #
+    # Note that we accept C style "/* */" comments for terminating
+    # namespaces, so that code that terminates namespaces inside
+    # preprocessor macros can be cpplint clean.
+    #
+    # We also accept stuff like "// end of namespace <name>." with the
+    # period at the end.
+    #
+    # Besides these, we don't accept anything else, otherwise we might
+    # get false negatives when an existing comment is a substring of the
+    # expected namespace.
+    if self.name:
+      # Named namespace
+      if not Match((r'};*\s*(//|/\*).*\bnamespace\s+' + re.escape(self.name) +
+                    r'[\*/\.\\\s]*$'),
+                   line):
+        error(filename, linenum, 'readability/namespace', 5,
+              'Namespace should be terminated with "// namespace %s"' %
+              self.name)
+    else:
+      # Anonymous namespace
+      if not Match(r'};*\s*(//|/\*).*\bnamespace[\*/\.\\\s]*$', line):
+        # If "// namespace anonymous" or "// anonymous namespace (more text)",
+        # mention "// anonymous namespace" as an acceptable form
+        if Match(r'}.*\b(namespace anonymous|anonymous namespace)\b', line):
+          error(filename, linenum, 'readability/namespace', 5,
+                'Anonymous namespace should be terminated with "// namespace"'
+                ' or "// anonymous namespace"')
+        else:
+          error(filename, linenum, 'readability/namespace', 5,
+                'Anonymous namespace should be terminated with "// namespace"')
+
+
+class _PreprocessorInfo(object):
+  """Stores checkpoints of nesting stacks when #if/#else is seen."""
+
+  def __init__(self, stack_before_if):
+    # The entire nesting stack before #if
+    self.stack_before_if = stack_before_if
+
+    # The entire nesting stack up to #else
+    self.stack_before_else = []
+
+    # Whether we have already seen #else or #elif
+    self.seen_else = False
+
+
+class NestingState(object):
+  """Holds states related to parsing braces."""
+
+  def __init__(self):
+    # Stack for tracking all braces.  An object is pushed whenever we
+    # see a "{", and popped when we see a "}".  Only 3 types of
+    # objects are possible:
+    # - _ClassInfo: a class or struct.
+    # - _NamespaceInfo: a namespace.
+    # - _BlockInfo: some other type of block.
+    self.stack = []
+
+    # Top of the previous stack before each Update().
+    #
+    # Because the nesting_stack is updated at the end of each line, we
+    # had to do some convoluted checks to find out what is the current
+    # scope at the beginning of the line.  This check is simplified by
+    # saving the previous top of nesting stack.
+    #
+    # We could save the full stack, but we only need the top.  Copying
+    # the full nesting stack would slow down cpplint by ~10%.
+    self.previous_stack_top = []
+
+    # Stack of _PreprocessorInfo objects.
+    self.pp_stack = []
+
+  def SeenOpenBrace(self):
+    """Check if we have seen the opening brace for the innermost block.
+
+    Returns:
+      True if we have seen the opening brace, False if the innermost
+      block is still expecting an opening brace.
+    """
+    return (not self.stack) or self.stack[-1].seen_open_brace
+
+  def InNamespaceBody(self):
+    """Check if we are currently one level inside a namespace body.
+
+    Returns:
+      True if top of the stack is a namespace block, False otherwise.
+    """
+    return self.stack and isinstance(self.stack[-1], _NamespaceInfo)
+
+  def InExternC(self):
+    """Check if we are currently one level inside an 'extern "C"' block.
+
+    Returns:
+      True if top of the stack is an extern block, False otherwise.
+    """
+    return self.stack and isinstance(self.stack[-1], _ExternCInfo)
+
+  def InClassDeclaration(self):
+    """Check if we are currently one level inside a class or struct declaration.
+
+    Returns:
+      True if top of the stack is a class/struct, False otherwise.
+    """
+    return self.stack and isinstance(self.stack[-1], _ClassInfo)
+
+  def InAsmBlock(self):
+    """Check if we are currently one level inside an inline ASM block.
+
+    Returns:
+      True if the top of the stack is a block containing inline ASM.
+    """
+    return self.stack and self.stack[-1].inline_asm != _NO_ASM
+
+  def InTemplateArgumentList(self, clean_lines, linenum, pos):
+    """Check if current position is inside template argument list.
+
+    Args:
+      clean_lines: A CleansedLines instance containing the file.
+      linenum: The number of the line to check.
+      pos: position just after the suspected template argument.
+    Returns:
+      True if (linenum, pos) is inside template arguments.
+    """
+    while linenum < clean_lines.NumLines():
+      # Find the earliest character that might indicate a template argument
+      line = clean_lines.elided[linenum]
+      match = Match(r'^[^{};=\[\]\.<>]*(.)', line[pos:])
+      if not match:
+        linenum += 1
+        pos = 0
+        continue
+      token = match.group(1)
+      pos += len(match.group(0))
+
+      # These things do not look like template argument list:
+      #   class Suspect {
+      #   class Suspect x; }
+      if token in ('{', '}', ';'): return False
+
+      # These things look like template argument list:
+      #   template <class Suspect>
+      #   template <class Suspect = default_value>
+      #   template <class Suspect[]>
+      #   template <class Suspect...>
+      if token in ('>', '=', '[', ']', '.'): return True
+
+      # Check if token is an unmatched '<'.
+      # If not, move on to the next character.
+      if token != '<':
+        pos += 1
+        if pos >= len(line):
+          linenum += 1
+          pos = 0
+        continue
+
+      # We can't be sure if we just find a single '<', and need to
+      # find the matching '>'.
+      (_, end_line, end_pos) = CloseExpression(clean_lines, linenum, pos - 1)
+      if end_pos < 0:
+        # Not sure if template argument list or syntax error in file
+        return False
+      linenum = end_line
+      pos = end_pos
+    return False
+
+  def UpdatePreprocessor(self, line):
+    """Update preprocessor stack.
+
+    We need to handle preprocessors due to classes like this:
+      #ifdef SWIG
+      struct ResultDetailsPageElementExtensionPoint {
+      #else
+      struct ResultDetailsPageElementExtensionPoint : public Extension {
+      #endif
+
+    We make the following assumptions (good enough for most files):
+    - Preprocessor condition evaluates to true from #if up to first
+      #else/#elif/#endif.
+
+    - Preprocessor condition evaluates to false from #else/#elif up
+      to #endif.  We still perform lint checks on these lines, but
+      these do not affect nesting stack.
+
+    Args:
+      line: current line to check.
+    """
+    if Match(r'^\s*#\s*(if|ifdef|ifndef)\b', line):
+      # Beginning of #if block, save the nesting stack here.  The saved
+      # stack will allow us to restore the parsing state in the #else case.
+      self.pp_stack.append(_PreprocessorInfo(copy.deepcopy(self.stack)))
+    elif Match(r'^\s*#\s*(else|elif)\b', line):
+      # Beginning of #else block
+      if self.pp_stack:
+        if not self.pp_stack[-1].seen_else:
+          # This is the first #else or #elif block.  Remember the
+          # whole nesting stack up to this point.  This is what we
+          # keep after the #endif.
+          self.pp_stack[-1].seen_else = True
+          self.pp_stack[-1].stack_before_else = copy.deepcopy(self.stack)
+
+        # Restore the stack to how it was before the #if
+        self.stack = copy.deepcopy(self.pp_stack[-1].stack_before_if)
+      else:
+        # TODO(unknown): unexpected #else, issue warning?
+        pass
+    elif Match(r'^\s*#\s*endif\b', line):
+      # End of #if or #else blocks.
+      if self.pp_stack:
+        # If we saw an #else, we will need to restore the nesting
+        # stack to its former state before the #else, otherwise we
+        # will just continue from where we left off.
+        if self.pp_stack[-1].seen_else:
+          # Here we can just use a shallow copy since we are the last
+          # reference to it.
+          self.stack = self.pp_stack[-1].stack_before_else
+        # Drop the corresponding #if
+        self.pp_stack.pop()
+      else:
+        # TODO(unknown): unexpected #endif, issue warning?
+        pass
+
+  # TODO(unknown): Update() is too long, but we will refactor later.
+  def Update(self, filename, clean_lines, linenum, error):
+    """Update nesting state with current line.
+
+    Args:
+      filename: The name of the current file.
+      clean_lines: A CleansedLines instance containing the file.
+      linenum: The number of the line to check.
+      error: The function to call with any errors found.
+    """
+    line = clean_lines.elided[linenum]
+
+    # Remember top of the previous nesting stack.
+    #
+    # The stack is always pushed/popped and not modified in place, so
+    # we can just do a shallow copy instead of copy.deepcopy.  Using
+    # deepcopy would slow down cpplint by ~28%.
+    if self.stack:
+      self.previous_stack_top = self.stack[-1]
+    else:
+      self.previous_stack_top = None
+
+    # Update pp_stack
+    self.UpdatePreprocessor(line)
+
+    # Count parentheses.  This is to avoid adding struct arguments to
+    # the nesting stack.
+    if self.stack:
+      inner_block = self.stack[-1]
+      depth_change = line.count('(') - line.count(')')
+      inner_block.open_parentheses += depth_change
+
+      # Also check if we are starting or ending an inline assembly block.
+      if inner_block.inline_asm in (_NO_ASM, _END_ASM):
+        if (depth_change != 0 and
+            inner_block.open_parentheses == 1 and
+            _MATCH_ASM.match(line)):
+          # Enter assembly block
+          inner_block.inline_asm = _INSIDE_ASM
+        else:
+          # Not entering assembly block.  If previous line was _END_ASM,
+          # we will now shift to _NO_ASM state.
+          inner_block.inline_asm = _NO_ASM
+      elif (inner_block.inline_asm == _INSIDE_ASM and
+            inner_block.open_parentheses == 0):
+        # Exit assembly block
+        inner_block.inline_asm = _END_ASM
+
+    # Consume namespace declaration at the beginning of the line.  Do
+    # this in a loop so that we catch same line declarations like this:
+    #   namespace proto2 { namespace bridge { class MessageSet; } }
+    while True:
+      # Match start of namespace.  The "\b\s*" below catches namespace
+      # declarations even if they aren't followed by whitespace, so
+      # that we don't confuse our namespace checker.  The
+      # missing spaces will be flagged by CheckSpacing.
+      namespace_decl_match = Match(r'^\s*namespace\b\s*([:\w]+)?(.*)$', line)
+      if not namespace_decl_match:
+        break
+
+      new_namespace = _NamespaceInfo(namespace_decl_match.group(1), linenum)
+      self.stack.append(new_namespace)
+
+      line = namespace_decl_match.group(2)
+      if line.find('{') != -1:
+        new_namespace.seen_open_brace = True
+        line = line[line.find('{') + 1:]
+
+    # Look for a class declaration in whatever is left of the line
+    # after parsing namespaces.  The regexp accounts for decorated classes
+    # such as in:
+    #   class LOCKABLE API Object {
+    #   };
+    class_decl_match = Match(
+        r'^(\s*(?:template\s*<[\w\s<>,:]*>\s*)?'
+        r'(class|struct)\s+(?:[A-Z_]+\s+)*(\w+(?:::\w+)*))'
+        r'(.*)$', line)
+    if (class_decl_match and
+        (not self.stack or self.stack[-1].open_parentheses == 0)):
+      # We do not want to accept classes that are actually template arguments:
+      #   template <class Ignore1,
+      #             class Ignore2 = Default<Args>,
+      #             template <Args> class Ignore3>
+      #   void Function() {};
+      #
+      # To avoid template argument cases, we scan forward and look for
+      # an unmatched '>'.  If we see one, assume we are inside a
+      # template argument list.
+      end_declaration = len(class_decl_match.group(1))
+      if not self.InTemplateArgumentList(clean_lines, linenum, end_declaration):
+        self.stack.append(_ClassInfo(
+            class_decl_match.group(3), class_decl_match.group(2),
+            clean_lines, linenum))
+        line = class_decl_match.group(4)
+
+    # If we have not yet seen the opening brace for the innermost block,
+    # run checks here.
+    if not self.SeenOpenBrace():
+      self.stack[-1].CheckBegin(filename, clean_lines, linenum, error)
+
+    # Update access control if we are inside a class/struct
+    if self.stack and isinstance(self.stack[-1], _ClassInfo):
+      classinfo = self.stack[-1]
+      access_match = Match(
+          r'^(.*)\b(public|private|protected|signals)(\s+(?:slots\s*)?)?'
+          r':(?:[^:]|$)',
+          line)
+      if access_match:
+        classinfo.access = access_match.group(2)
+
+        # Check that access keywords are indented +1 space.  Skip this
+        # check if the keywords are not preceded by whitespace.
+        indent = access_match.group(1)
+        if (len(indent) != classinfo.class_indent + 1 and
+            Match(

<TRUNCATED>

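Editorial note on the cpplint.py hunk above: the header-guard name that GetHeaderGuardCPPVariable() suggests can be illustrated with a small standalone sketch. This is not the code from the patch. It assumes the caller already has a repository-relative path (the patch obtains it from FileInfo.RepositoryName(), which is not reproduced here), it hard-codes '/' as the separator, and it escapes the root prefix with re.escape, which the patch does not do. The function name header_guard_variable and its parameters are hypothetical.

```python
import re


def header_guard_variable(path_from_repo_root, root=None):
    """Derive a header-guard macro name from a repository-relative path.

    Hypothetical helper: a simplified stand-in for the logic shown in the
    patch, without the FileInfo lookup.
    """
    # Undo the renaming that Emacs flymake applies to temporary copies.
    path = re.sub(r'_flymake\.h$', '.h', path_from_repo_root)
    path = re.sub(r'/\.flymake/([^/]*)$', r'/\1', path)
    # Spell out 'c++' so the guard does not collapse it to 'C__'.
    path = path.replace('C++', 'cpp').replace('c++', 'cpp')
    # Optionally strip a leading root directory (assumption: '/' separator).
    if root:
        path = re.sub('^' + re.escape(root) + '/', '', path)
    # Every non-alphanumeric character becomes '_', then add the trailing '_'.
    return re.sub(r'[^a-zA-Z0-9]', '_', path).upper() + '_'


if __name__ == '__main__':
    print(header_guard_variable('src/arrow/util/buffer.h'))
    print(header_guard_variable('src/arrow/util/buffer.h', root='src'))
```

CheckForHeaderGuard() then compares the macro on the first #ifndef line against this name, and also tolerates one extra trailing underscore at a lower severity.
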
[04/17] arrow git commit: ARROW-1: Initial Arrow Code Commit

Posted by ja...@apache.org.
http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedListVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedListVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedListVector.java
new file mode 100644
index 0000000..778fe81
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedListVector.java
@@ -0,0 +1,428 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.AddOrGetResult;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+import org.apache.arrow.vector.complex.impl.NullReader;
+import org.apache.arrow.vector.complex.impl.RepeatedListReaderImpl;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.ComplexHolder;
+import org.apache.arrow.vector.holders.RepeatedListHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.JsonStringArrayList;
+import org.apache.arrow.vector.util.TransferPair;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+
+public class RepeatedListVector extends AbstractContainerVector
+    implements RepeatedValueVector, RepeatedFixedWidthVectorLike {
+
+  public final static MajorType TYPE = new MajorType(MinorType.LIST, DataMode.REPEATED);
+  private final RepeatedListReaderImpl reader = new RepeatedListReaderImpl(null, this);
+  private final DelegateRepeatedVector delegate;
+
+  protected static class DelegateRepeatedVector extends BaseRepeatedValueVector {
+
+    private final RepeatedListAccessor accessor = new RepeatedListAccessor();
+    private final RepeatedListMutator mutator = new RepeatedListMutator();
+    private final EmptyValuePopulator emptyPopulator;
+    private transient DelegateTransferPair ephPair;
+
+    public class RepeatedListAccessor extends BaseRepeatedValueVector.BaseRepeatedAccessor {
+
+      @Override
+      public Object getObject(int index) {
+        final List<Object> list = new JsonStringArrayList<>();
+        final int start = offsets.getAccessor().get(index);
+        final int until = offsets.getAccessor().get(index+1);
+        for (int i = start; i < until; i++) {
+          list.add(vector.getAccessor().getObject(i));
+        }
+        return list;
+      }
+
+      public void get(int index, RepeatedListHolder holder) {
+        assert index <= getValueCapacity();
+        holder.start = getOffsetVector().getAccessor().get(index);
+        holder.end = getOffsetVector().getAccessor().get(index+1);
+      }
+
+      public void get(int index, ComplexHolder holder) {
+        final FieldReader reader = getReader();
+        reader.setPosition(index);
+        holder.reader = reader;
+      }
+
+      public void get(int index, int arrayIndex, ComplexHolder holder) {
+        final RepeatedListHolder listHolder = new RepeatedListHolder();
+        get(index, listHolder);
+        int offset = listHolder.start + arrayIndex;
+        if (offset >= listHolder.end) {
+          holder.reader = NullReader.INSTANCE;
+        } else {
+          FieldReader r = getDataVector().getReader();
+          r.setPosition(offset);
+          holder.reader = r;
+        }
+      }
+    }
+
+    public class RepeatedListMutator extends BaseRepeatedValueVector.BaseRepeatedMutator {
+
+      public int add(int index) {
+        final int curEnd = getOffsetVector().getAccessor().get(index+1);
+        getOffsetVector().getMutator().setSafe(index + 1, curEnd + 1);
+        return curEnd;
+      }
+
+      @Override
+      public void startNewValue(int index) {
+        emptyPopulator.populate(index+1);
+        super.startNewValue(index);
+      }
+
+      @Override
+      public void setValueCount(int valueCount) {
+        emptyPopulator.populate(valueCount);
+        super.setValueCount(valueCount);
+      }
+    }
+
+
+    public class DelegateTransferPair implements TransferPair {
+      private final DelegateRepeatedVector target;
+      private final TransferPair[] children;
+
+      public DelegateTransferPair(DelegateRepeatedVector target) {
+        this.target = Preconditions.checkNotNull(target);
+        if (target.getDataVector() == DEFAULT_DATA_VECTOR) {
+          target.addOrGetVector(VectorDescriptor.create(getDataVector().getField()));
+          target.getDataVector().allocateNew();
+        }
+        this.children = new TransferPair[] {
+            getOffsetVector().makeTransferPair(target.getOffsetVector()),
+            getDataVector().makeTransferPair(target.getDataVector())
+        };
+      }
+
+      @Override
+      public void transfer() {
+        for (TransferPair child:children) {
+          child.transfer();
+        }
+      }
+
+      @Override
+      public ValueVector getTo() {
+        return target;
+      }
+
+      @Override
+      public void splitAndTransfer(int startIndex, int length) {
+        target.allocateNew();
+        for (int i = 0; i < length; i++) {
+          copyValueSafe(startIndex + i, i);
+        }
+      }
+
+      @Override
+      public void copyValueSafe(int srcIndex, int destIndex) {
+        final RepeatedListHolder holder = new RepeatedListHolder();
+        getAccessor().get(srcIndex, holder);
+        target.emptyPopulator.populate(destIndex+1);
+        final TransferPair vectorTransfer = children[1];
+        int newIndex = target.getOffsetVector().getAccessor().get(destIndex);
+        // TODO: make this a bulk copy.
+        for (int i = holder.start; i < holder.end; i++, newIndex++) {
+          vectorTransfer.copyValueSafe(i, newIndex);
+        }
+        target.getOffsetVector().getMutator().setSafe(destIndex + 1, newIndex);
+      }
+    }
+
+    public DelegateRepeatedVector(String path, BufferAllocator allocator) {
+      this(MaterializedField.create(path, TYPE), allocator);
+    }
+
+    public DelegateRepeatedVector(MaterializedField field, BufferAllocator allocator) {
+      super(field, allocator);
+      emptyPopulator = new EmptyValuePopulator(getOffsetVector());
+    }
+
+    @Override
+    public void allocateNew() throws OutOfMemoryException {
+      if (!allocateNewSafe()) {
+        throw new OutOfMemoryException();
+      }
+    }
+
+    @Override
+    public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+      return makeTransferPair(new DelegateRepeatedVector(ref, allocator));
+    }
+
+    @Override
+    public TransferPair makeTransferPair(ValueVector target) {
+      return new DelegateTransferPair(DelegateRepeatedVector.class.cast(target));
+    }
+
+    @Override
+    public RepeatedListAccessor getAccessor() {
+      return accessor;
+    }
+
+    @Override
+    public RepeatedListMutator getMutator() {
+      return mutator;
+    }
+
+    @Override
+    public FieldReader getReader() {
+      throw new UnsupportedOperationException();
+    }
+
+    public void copyFromSafe(int fromIndex, int thisIndex, DelegateRepeatedVector from) {
+      if(ephPair == null || ephPair.target != from) {
+        ephPair = DelegateTransferPair.class.cast(from.makeTransferPair(this));
+      }
+      ephPair.copyValueSafe(fromIndex, thisIndex);
+    }
+
+  }
+
+  protected class RepeatedListTransferPair implements TransferPair {
+    private final TransferPair delegate;
+
+    public RepeatedListTransferPair(TransferPair delegate) {
+      this.delegate = delegate;
+    }
+
+    public void transfer() {
+      delegate.transfer();
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      delegate.splitAndTransfer(startIndex, length);
+    }
+
+    @Override
+    public ValueVector getTo() {
+      final DelegateRepeatedVector delegateVector = DelegateRepeatedVector.class.cast(delegate.getTo());
+      return new RepeatedListVector(getField(), allocator, callBack, delegateVector);
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) {
+      delegate.copyValueSafe(from, to);
+    }
+  }
+
+  public RepeatedListVector(String path, BufferAllocator allocator, CallBack callBack) {
+    this(MaterializedField.create(path, TYPE), allocator, callBack);
+  }
+
+  public RepeatedListVector(MaterializedField field, BufferAllocator allocator, CallBack callBack) {
+    this(field, allocator, callBack, new DelegateRepeatedVector(field, allocator));
+  }
+
+  protected RepeatedListVector(MaterializedField field, BufferAllocator allocator, CallBack callBack, DelegateRepeatedVector delegate) {
+    super(field, allocator, callBack);
+    this.delegate = Preconditions.checkNotNull(delegate);
+
+    final List<MaterializedField> children = Lists.newArrayList(field.getChildren());
+    final int childSize = children.size();
+    assert childSize < 3;
+    final boolean hasChild = childSize > 0;
+    if (hasChild) {
+      // the last field is data field
+      final MaterializedField child = children.get(childSize-1);
+      addOrGetVector(VectorDescriptor.create(child));
+    }
+  }
+
+
+  @Override
+  public RepeatedListReaderImpl getReader() {
+    return reader;
+  }
+
+  @Override
+  public DelegateRepeatedVector.RepeatedListAccessor getAccessor() {
+    return delegate.getAccessor();
+  }
+
+  @Override
+  public DelegateRepeatedVector.RepeatedListMutator getMutator() {
+    return delegate.getMutator();
+  }
+
+  @Override
+  public UInt4Vector getOffsetVector() {
+    return delegate.getOffsetVector();
+  }
+
+  @Override
+  public ValueVector getDataVector() {
+    return delegate.getDataVector();
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryException {
+    delegate.allocateNew();
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    return delegate.allocateNewSafe();
+  }
+
+  @Override
+  public <T extends ValueVector> AddOrGetResult<T> addOrGetVector(VectorDescriptor descriptor) {
+    final AddOrGetResult<T> result = delegate.addOrGetVector(descriptor);
+    if (result.isCreated() && callBack != null) {
+      callBack.doWork();
+    }
+    this.field = delegate.getField();
+    return result;
+  }
+
+  @Override
+  public int size() {
+    return delegate.size();
+  }
+
+  @Override
+  public int getBufferSize() {
+    return delegate.getBufferSize();
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    return delegate.getBufferSizeFor(valueCount);
+  }
+
+  @Override
+  public void close() {
+    delegate.close();
+  }
+
+  @Override
+  public void clear() {
+    delegate.clear();
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new RepeatedListTransferPair(delegate.getTransferPair(allocator));
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new RepeatedListTransferPair(delegate.getTransferPair(ref, allocator));
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    final RepeatedListVector target = RepeatedListVector.class.cast(to);
+    return new RepeatedListTransferPair(delegate.makeTransferPair(target.delegate));
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return delegate.getValueCapacity();
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    return delegate.getBuffers(clear);
+  }
+
+
+//  @Override
+//  public void load(SerializedField metadata, DrillBuf buf) {
+//    delegate.load(metadata, buf);
+//  }
+
+//  @Override
+//  public SerializedField getMetadata() {
+//    return delegate.getMetadata();
+//  }
+
+  @Override
+  public Iterator<ValueVector> iterator() {
+    return delegate.iterator();
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    delegate.setInitialCapacity(numRecords);
+  }
+
+  /**
+   * @deprecated
+   *   prefer using {@link #addOrGetVector(org.apache.arrow.vector.VectorDescriptor)} instead.
+   */
+  @Override
+  public <T extends ValueVector> T addOrGet(String name, MajorType type, Class<T> clazz) {
+    final AddOrGetResult<T> result = addOrGetVector(VectorDescriptor.create(type));
+    return result.getVector();
+  }
+
+  @Override
+  public <T extends ValueVector> T getChild(String name, Class<T> clazz) {
+    if (name != null) {
+      return null;
+    }
+    return typeify(delegate.getDataVector(), clazz);
+  }
+
+  @Override
+  public void allocateNew(int valueCount, int innerValueCount) {
+    clear();
+    getOffsetVector().allocateNew(valueCount + 1);
+    getMutator().reset();
+  }
+
+  @Override
+  public VectorWithOrdinal getChildVectorWithOrdinal(String name) {
+    if (name != null) {
+      return null;
+    }
+    return new VectorWithOrdinal(delegate.getDataVector(), 0);
+  }
+
+  public void copyFromSafe(int fromIndex, int thisIndex, RepeatedListVector from) {
+    delegate.copyFromSafe(fromIndex, thisIndex, from.delegate);
+  }
+}
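
Note on the layout used by RepeatedListVector above: list i occupies the half-open range [offsets[i], offsets[i+1]) of the data vector, which is what getObject() and get(index, RepeatedListHolder) read, and add(index) grows list i by bumping offsets[i+1]. The following is only a minimal sketch of that indexing, using plain int arrays instead of UInt4Vector. The class OffsetListSketch and its method names are hypothetical and are not part of the Arrow code base. The startNewValue helper assumes the behaviour shown in RepeatedMapVector.Mutator.startNewValue in this same commit (copy the previous end offset forward); the base-class version used by RepeatedListMutator is not shown in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the offsets/data pair; not Arrow API.
public class OffsetListSketch {

  // List `index` is data[offsets[index]] .. data[offsets[index + 1] - 1].
  static List<Integer> getList(int[] offsets, int[] data, int index) {
    final List<Integer> out = new ArrayList<>();
    for (int i = offsets[index]; i < offsets[index + 1]; i++) {
      out.add(data[i]);
    }
    return out;
  }

  // Assumed behaviour: a new list starts out empty, so its end offset
  // is initialised to the end of the previous list.
  static void startNewValue(int[] offsets, int index) {
    offsets[index + 1] = offsets[index];
  }

  // Same shape as RepeatedListMutator.add(index): extend list `index`
  // by one element and return the data slot the caller should fill.
  static int add(int[] offsets, int index) {
    final int curEnd = offsets[index + 1];
    offsets[index + 1] = curEnd + 1;
    return curEnd;
  }

  public static void main(String[] args) {
    final int[] offsets = new int[3]; // room for two lists
    final int[] data = new int[3];

    startNewValue(offsets, 0);
    data[add(offsets, 0)] = 7;
    data[add(offsets, 0)] = 8;

    startNewValue(offsets, 1);
    data[add(offsets, 1)] = 9;

    System.out.println(getList(offsets, data, 0)); // [7, 8]
    System.out.println(getList(offsets, data, 1)); // [9]
  }
}
```

An empty list is simply two equal consecutive offsets, which is why the vector needs valueCount + 1 offset entries.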

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedMapVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedMapVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedMapVector.java
new file mode 100644
index 0000000..e7eacd3
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedMapVector.java
@@ -0,0 +1,584 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import io.netty.buffer.ArrowBuf;
+
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.AddOrGetResult;
+import org.apache.arrow.vector.AllocationHelper;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+import org.apache.arrow.vector.complex.impl.NullReader;
+import org.apache.arrow.vector.complex.impl.RepeatedMapReaderImpl;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.ComplexHolder;
+import org.apache.arrow.vector.holders.RepeatedMapHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.CallBack;
+import org.apache.arrow.vector.util.JsonStringArrayList;
+import org.apache.arrow.vector.util.TransferPair;
+import org.apache.commons.lang3.ArrayUtils;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Maps;
+
+public class RepeatedMapVector extends AbstractMapVector
+    implements RepeatedValueVector, RepeatedFixedWidthVectorLike {
+  //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RepeatedMapVector.class);
+
+  public final static MajorType TYPE = new MajorType(MinorType.MAP, DataMode.REPEATED);
+
+  private final UInt4Vector offsets;   // offsets to start of each record (considering record indices are 0-indexed)
+  private final RepeatedMapReaderImpl reader = new RepeatedMapReaderImpl(RepeatedMapVector.this);
+  private final RepeatedMapAccessor accessor = new RepeatedMapAccessor();
+  private final Mutator mutator = new Mutator();
+  private final EmptyValuePopulator emptyPopulator;
+
+  public RepeatedMapVector(MaterializedField field, BufferAllocator allocator, CallBack callBack){
+    super(field, allocator, callBack);
+    this.offsets = new UInt4Vector(BaseRepeatedValueVector.OFFSETS_FIELD, allocator);
+    this.emptyPopulator = new EmptyValuePopulator(offsets);
+  }
+
+  @Override
+  public UInt4Vector getOffsetVector() {
+    return offsets;
+  }
+
+  @Override
+  public ValueVector getDataVector() {
+    throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public <T extends ValueVector> AddOrGetResult<T> addOrGetVector(VectorDescriptor descriptor) {
+    throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public void setInitialCapacity(int numRecords) {
+    offsets.setInitialCapacity(numRecords + 1);
+    for(final ValueVector v : (Iterable<ValueVector>) this) {
+      v.setInitialCapacity(numRecords * RepeatedValueVector.DEFAULT_REPEAT_PER_RECORD);
+    }
+  }
+
+  @Override
+  public RepeatedMapReaderImpl getReader() {
+    return reader;
+  }
+
+  @Override
+  public void allocateNew(int groupCount, int innerValueCount) {
+    clear();
+    try {
+      offsets.allocateNew(groupCount + 1);
+      for (ValueVector v : getChildren()) {
+        AllocationHelper.allocatePrecomputedChildCount(v, groupCount, 50, innerValueCount);
+      }
+    } catch (OutOfMemoryException e){
+      clear();
+      throw e;
+    }
+    offsets.zeroVector();
+    mutator.reset();
+  }
+
+  public Iterator<String> fieldNameIterator() {
+    return getChildFieldNames().iterator();
+  }
+
+  @Override
+  public List<ValueVector> getPrimitiveVectors() {
+    final List<ValueVector> primitiveVectors = super.getPrimitiveVectors();
+    primitiveVectors.add(offsets);
+    return primitiveVectors;
+  }
+
+  @Override
+  public int getBufferSize() {
+    if (getAccessor().getValueCount() == 0) {
+      return 0;
+    }
+    long bufferSize = offsets.getBufferSize();
+    for (final ValueVector v : (Iterable<ValueVector>) this) {
+      bufferSize += v.getBufferSize();
+    }
+    return (int) bufferSize;
+  }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+    if (valueCount == 0) {
+      return 0;
+    }
+
+    long bufferSize = 0;
+    for (final ValueVector v : (Iterable<ValueVector>) this) {
+      bufferSize += v.getBufferSizeFor(valueCount);
+    }
+
+    return (int) bufferSize;
+  }
+
+  @Override
+  public void close() {
+    offsets.close();
+    super.close();
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator) {
+    return new RepeatedMapTransferPair(this, getField().getPath(), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+    return new RepeatedMapTransferPair(this, (RepeatedMapVector)to);
+  }
+
+  MapSingleCopier makeSingularCopier(MapVector to) {
+    return new MapSingleCopier(this, to);
+  }
+
+  protected static class MapSingleCopier {
+    private final TransferPair[] pairs;
+    public final RepeatedMapVector from;
+
+    public MapSingleCopier(RepeatedMapVector from, MapVector to) {
+      this.from = from;
+      this.pairs = new TransferPair[from.size()];
+
+      int i = 0;
+      ValueVector vector;
+      for (final String child:from.getChildFieldNames()) {
+        int preSize = to.size();
+        vector = from.getChild(child);
+        if (vector == null) {
+          continue;
+        }
+        final ValueVector newVector = to.addOrGet(child, vector.getField().getType(), vector.getClass());
+        if (to.size() != preSize) {
+          newVector.allocateNew();
+        }
+        pairs[i++] = vector.makeTransferPair(newVector);
+      }
+    }
+
+    public void copySafe(int fromSubIndex, int toIndex) {
+      for (TransferPair p : pairs) {
+        p.copyValueSafe(fromSubIndex, toIndex);
+      }
+    }
+  }
+
+  public TransferPair getTransferPairToSingleMap(String reference, BufferAllocator allocator) {
+    return new SingleMapTransferPair(this, reference, allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator allocator) {
+    return new RepeatedMapTransferPair(this, ref, allocator);
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+    /* Tracks whether all memory allocations succeeded. This matters for
+     * composite vectors, where multiple buffers are allocated for multiple
+     * vectors: if any one allocation fails, all memory allocated so far
+     * must be cleared.
+     */
+    boolean success = false;
+    try {
+      if (!offsets.allocateNewSafe()) {
+        return false;
+      }
+      success = super.allocateNewSafe();
+    } finally {
+      if (!success) {
+        clear();
+      }
+    }
+    offsets.zeroVector();
+    return success;
+  }
+
+  protected static class SingleMapTransferPair implements TransferPair {
+    private final TransferPair[] pairs;
+    private final RepeatedMapVector from;
+    private final MapVector to;
+    private static final MajorType MAP_TYPE = new MajorType(MinorType.MAP, DataMode.REQUIRED);
+
+    public SingleMapTransferPair(RepeatedMapVector from, String path, BufferAllocator allocator) {
+      this(from, new MapVector(MaterializedField.create(path, MAP_TYPE), allocator, from.callBack), false);
+    }
+
+    public SingleMapTransferPair(RepeatedMapVector from, MapVector to) {
+      this(from, to, true);
+    }
+
+    public SingleMapTransferPair(RepeatedMapVector from, MapVector to, boolean allocate) {
+      this.from = from;
+      this.to = to;
+      this.pairs = new TransferPair[from.size()];
+      int i = 0;
+      ValueVector vector;
+      for (final String child : from.getChildFieldNames()) {
+        int preSize = to.size();
+        vector = from.getChild(child);
+        if (vector == null) {
+          continue;
+        }
+        final ValueVector newVector = to.addOrGet(child, vector.getField().getType(), vector.getClass());
+        if (allocate && to.size() != preSize) {
+          newVector.allocateNew();
+        }
+        pairs[i++] = vector.makeTransferPair(newVector);
+      }
+    }
+
+
+    @Override
+    public void transfer() {
+      for (TransferPair p : pairs) {
+        p.transfer();
+      }
+      to.getMutator().setValueCount(from.getAccessor().getValueCount());
+      from.clear();
+    }
+
+    @Override
+    public ValueVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void copyValueSafe(int from, int to) {
+      for (TransferPair p : pairs) {
+        p.copyValueSafe(from, to);
+      }
+    }
+
+    @Override
+    public void splitAndTransfer(int startIndex, int length) {
+      for (TransferPair p : pairs) {
+        p.splitAndTransfer(startIndex, length);
+      }
+      to.getMutator().setValueCount(length);
+    }
+  }
+
+  private static class RepeatedMapTransferPair implements TransferPair{
+
+    private final TransferPair[] pairs;
+    private final RepeatedMapVector to;
+    private final RepeatedMapVector from;
+
+    public RepeatedMapTransferPair(RepeatedMapVector from, String path, BufferAllocator allocator) {
+      this(from, new RepeatedMapVector(MaterializedField.create(path, TYPE), allocator, from.callBack), false);
+    }
+
+    public RepeatedMapTransferPair(RepeatedMapVector from, RepeatedMapVector to) {
+      this(from, to, true);
+    }
+
+    public RepeatedMapTransferPair(RepeatedMapVector from, RepeatedMapVector to, boolean allocate) {
+      this.from = from;
+      this.to = to;
+      this.pairs = new TransferPair[from.size()];
+      this.to.ephPair = null;
+
+      int i = 0;
+      ValueVector vector;
+      for (final String child : from.getChildFieldNames()) {
+        final int preSize = to.size();
+        vector = from.getChild(child);
+        if (vector == null) {
+          continue;
+        }
+
+        final ValueVector newVector = to.addOrGet(child, vector.getField().getType(), vector.getClass());
+        if (to.size() != preSize) {
+          newVector.allocateNew();
+        }
+
+        pairs[i++] = vector.makeTransferPair(newVector);
+      }
+    }
+
+    @Override
+    public void transfer() {
+      from.offsets.transferTo(to.offsets);
+      for (TransferPair p : pairs) {
+        p.transfer();
+      }
+      from.clear();
+    }
+
+    @Override
+    public ValueVector getTo() {
+      return to;
+    }
+
+    @Override
+    public void copyValueSafe(int srcIndex, int destIndex) {
+      RepeatedMapHolder holder = new RepeatedMapHolder();
+      from.getAccessor().get(srcIndex, holder);
+      to.emptyPopulator.populate(destIndex + 1);
+      int newIndex = to.offsets.getAccessor().get(destIndex);
+      // TODO: make these bulk copies
+      for (int i = holder.start; i < holder.end; i++, newIndex++) {
+        for (TransferPair p : pairs) {
+          p.copyValueSafe(i, newIndex);
+        }
+      }
+      to.offsets.getMutator().setSafe(destIndex + 1, newIndex);
+    }
+
+    @Override
+    public void splitAndTransfer(final int groupStart, final int groups) {
+      final UInt4Vector.Accessor a = from.offsets.getAccessor();
+      final UInt4Vector.Mutator m = to.offsets.getMutator();
+
+      final int startPos = a.get(groupStart);
+      final int endPos = a.get(groupStart + groups);
+      final int valuesToCopy = endPos - startPos;
+
+      to.offsets.clear();
+      to.offsets.allocateNew(groups + 1);
+
+      int normalizedPos;
+      for (int i = 0; i < groups + 1; i++) {
+        normalizedPos = a.get(groupStart + i) - startPos;
+        m.set(i, normalizedPos);
+      }
+
+      m.setValueCount(groups + 1);
+      to.emptyPopulator.populate(groups);
+
+      for (final TransferPair p : pairs) {
+        p.splitAndTransfer(startPos, valuesToCopy);
+      }
+    }
+  }
+
+
+  private transient RepeatedMapTransferPair ephPair;
+
+  public void copyFromSafe(int fromIndex, int thisIndex, RepeatedMapVector from) {
+    if (ephPair == null || ephPair.from != from) {
+      ephPair = (RepeatedMapTransferPair) from.makeTransferPair(this);
+    }
+    ephPair.copyValueSafe(fromIndex, thisIndex);
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return Math.max(offsets.getValueCapacity() - 1, 0);
+  }
+
+  @Override
+  public RepeatedMapAccessor getAccessor() {
+    return accessor;
+  }
+
+  @Override
+  public ArrowBuf[] getBuffers(boolean clear) {
+    final int expectedBufferSize = getBufferSize();
+    final int actualBufferSize = super.getBufferSize();
+
+    Preconditions.checkArgument(expectedBufferSize == actualBufferSize + offsets.getBufferSize());
+    return ArrayUtils.addAll(offsets.getBuffers(clear), super.getBuffers(clear));
+  }
+
+
+//  @Override
+//  public void load(SerializedField metadata, DrillBuf buffer) {
+//    final List<SerializedField> children = metadata.getChildList();
+//
+//    final SerializedField offsetField = children.get(0);
+//    offsets.load(offsetField, buffer);
+//    int bufOffset = offsetField.getBufferLength();
+//
+//    for (int i = 1; i < children.size(); i++) {
+//      final SerializedField child = children.get(i);
+//      final MaterializedField fieldDef = SerializedFieldHelper.create(child);
+//      ValueVector vector = getChild(fieldDef.getLastName());
+//      if (vector == null) {
+//        // if we arrive here, we didn't have a matching vector.
+//        vector = BasicTypeHelper.getNewVector(fieldDef, allocator);
+//        putChild(fieldDef.getLastName(), vector);
+//      }
+//      final int vectorLength = child.getBufferLength();
+//      vector.load(child, buffer.slice(bufOffset, vectorLength));
+//      bufOffset += vectorLength;
+//    }
+//
+//    assert bufOffset == buffer.capacity();
+//  }
+//
+//
+//  @Override
+//  public SerializedField getMetadata() {
+//    SerializedField.Builder builder = getField() //
+//        .getAsBuilder() //
+//        .setBufferLength(getBufferSize()) //
+//        // while we don't need to actually read this on load, we need it to make sure we don't skip deserialization of this vector
+//        .setValueCount(accessor.getValueCount());
+//    builder.addChild(offsets.getMetadata());
+//    for (final ValueVector child : getChildren()) {
+//      builder.addChild(child.getMetadata());
+//    }
+//    return builder.build();
+//  }
+
+  @Override
+  public Mutator getMutator() {
+    return mutator;
+  }
+
+  public class RepeatedMapAccessor implements RepeatedAccessor {
+    @Override
+    public Object getObject(int index) {
+      final List<Object> list = new JsonStringArrayList<>();
+      final int end = offsets.getAccessor().get(index+1);
+      String fieldName;
+      for (int i = offsets.getAccessor().get(index); i < end; i++) {
+        final Map<String, Object> vv = Maps.newLinkedHashMap();
+        for (final MaterializedField field : getField().getChildren()) {
+          if (!field.equals(BaseRepeatedValueVector.OFFSETS_FIELD)) {
+            fieldName = field.getLastName();
+            final Object value = getChild(fieldName).getAccessor().getObject(i);
+            if (value != null) {
+              vv.put(fieldName, value);
+            }
+          }
+        }
+        list.add(vv);
+      }
+      return list;
+    }
+
+    @Override
+    public int getValueCount() {
+      return Math.max(offsets.getAccessor().getValueCount() - 1, 0);
+    }
+
+    @Override
+    public int getInnerValueCount() {
+      final int valueCount = getValueCount();
+      if (valueCount == 0) {
+        return 0;
+      }
+      return offsets.getAccessor().get(valueCount);
+    }
+
+    @Override
+    public int getInnerValueCountAt(int index) {
+      return offsets.getAccessor().get(index+1) - offsets.getAccessor().get(index);
+    }
+
+    @Override
+    public boolean isEmpty(int index) {
+      return false;
+    }
+
+    @Override
+    public boolean isNull(int index) {
+      return false;
+    }
+
+    public void get(int index, RepeatedMapHolder holder) {
+      assert index < getValueCapacity() :
+        String.format("Attempted to access index %d when value capacity is %d",
+            index, getValueCapacity());
+      final UInt4Vector.Accessor offsetsAccessor = offsets.getAccessor();
+      holder.start = offsetsAccessor.get(index);
+      holder.end = offsetsAccessor.get(index + 1);
+    }
+
+    public void get(int index, ComplexHolder holder) {
+      final FieldReader reader = getReader();
+      reader.setPosition(index);
+      holder.reader = reader;
+    }
+
+    public void get(int index, int arrayIndex, ComplexHolder holder) {
+      final RepeatedMapHolder h = new RepeatedMapHolder();
+      get(index, h);
+      final int offset = h.start + arrayIndex;
+
+      if (offset >= h.end) {
+        holder.reader = NullReader.INSTANCE;
+      } else {
+        reader.setSinglePosition(index, arrayIndex);
+        holder.reader = reader;
+      }
+    }
+  }
+
+  public class Mutator implements RepeatedMutator {
+    @Override
+    public void startNewValue(int index) {
+      emptyPopulator.populate(index + 1);
+      offsets.getMutator().setSafe(index + 1, offsets.getAccessor().get(index));
+    }
+
+    @Override
+    public void setValueCount(int topLevelValueCount) {
+      emptyPopulator.populate(topLevelValueCount);
+      offsets.getMutator().setValueCount(topLevelValueCount == 0 ? 0 : topLevelValueCount + 1);
+      int childValueCount = offsets.getAccessor().get(topLevelValueCount);
+      for (final ValueVector v : getChildren()) {
+        v.getMutator().setValueCount(childValueCount);
+      }
+    }
+
+    @Override
+    public void reset() {}
+
+    @Override
+    public void generateTestData(int values) {}
+
+    public int add(int index) {
+      final int prevEnd = offsets.getAccessor().get(index + 1);
+      offsets.getMutator().setSafe(index + 1, prevEnd + 1);
+      return prevEnd;
+    }
+  }
+
+  @Override
+  public void clear() {
+    getMutator().reset();
+
+    offsets.clear();
+    for(final ValueVector vector : getChildren()) {
+      vector.clear();
+    }
+  }
+}
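
The offset bookkeeping done by Mutator.startNewValue and Mutator.add above can be sketched with a plain int array. This is only an illustration, not the vector implementation: OffsetMutatorSketch and its fixed-size array are hypothetical stand-ins for the UInt4Vector offsets, and the sketch leaves out the emptyPopulator call and the growth behaviour of setSafe.

```java
// Hypothetical sketch: an int[] replaces the UInt4Vector offsets buffer.
final class OffsetMutatorSketch {
  private final int[] offsets;

  OffsetMutatorSketch(int valueCapacity) {
    // One extra slot: offsets[i] is the start of value i, offsets[i + 1] its end.
    this.offsets = new int[valueCapacity + 1];
  }

  // Mirrors startNewValue: the new value starts where the previous one ended.
  void startNewValue(int index) {
    offsets[index + 1] = offsets[index];
  }

  // Mirrors add: reserves one more cell for the value and returns its position.
  int add(int index) {
    final int prevEnd = offsets[index + 1];
    offsets[index + 1] = prevEnd + 1;
    return prevEnd;
  }

  // Mirrors getInnerValueCountAt: number of cells held by one value.
  int innerValueCountAt(int index) {
    return offsets[index + 1] - offsets[index];
  }

  public static void main(String[] args) {
    OffsetMutatorSketch m = new OffsetMutatorSketch(4);
    m.startNewValue(0);
    m.add(0);
    m.add(0);
    m.startNewValue(1);
    System.out.println(m.add(1));               // prints 2
    System.out.println(m.innerValueCountAt(0)); // prints 2
  }
}
```

Each call to add returns the position in the child vectors where the caller writes the new cell, which is why the value returned for the second record continues from the end of the first.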

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedValueVector.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedValueVector.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedValueVector.java
new file mode 100644
index 0000000..99c0a0a
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedValueVector.java
@@ -0,0 +1,85 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+
+/**
+ * An abstraction representing repeated value vectors.
+ *
+ * A repeated vector contains values that may be either flat or nested. A value consists of zero or more cells (inner values).
+ * The current design maintains a data vector and an offset vector. Each cell is stored in the data vector, and the repeated
+ * vector uses the offset vector to determine the sequence of cells that belongs to an individual value.
+ *
+ */
+public interface RepeatedValueVector extends ValueVector, ContainerVectorLike {
+
+  final static int DEFAULT_REPEAT_PER_RECORD = 5;
+
+  /**
+   * Returns the underlying offset vector or null if none exists.
+   *
+   * TODO(DRILL-2995): eliminate exposing low-level interfaces.
+   */
+  UInt4Vector getOffsetVector();
+
+  /**
+   * Returns the underlying data vector or null if none exists.
+   */
+  ValueVector getDataVector();
+
+  @Override
+  RepeatedAccessor getAccessor();
+
+  @Override
+  RepeatedMutator getMutator();
+
+  interface RepeatedAccessor extends ValueVector.Accessor {
+    /**
+     * Returns the total number of cells that the vector contains.
+     *
+     * The result includes empty and null-valued cells.
+     */
+    int getInnerValueCount();
+
+
+    /**
+     * Returns the number of cells that the value at the given index contains.
+     */
+    int getInnerValueCountAt(int index);
+
+    /**
+     * Returns true if the value at the given index is empty, false otherwise.
+     *
+     * @param index  value index
+     */
+    boolean isEmpty(int index);
+  }
+
+  interface RepeatedMutator extends ValueVector.Mutator {
+    /**
+     * Starts a new value that is a container of cells.
+     *
+     * @param index  index of new value to start
+     */
+    void startNewValue(int index);
+
+
+  }
+}
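
The data/offsets layout described in the RepeatedValueVector Javadoc can be illustrated with two plain arrays. This is a minimal sketch under stated assumptions: OffsetLayoutSketch is a hypothetical class, and int arrays stand in for the UInt4Vector offset vector and the data ValueVector.

```java
import java.util.Arrays;

// Hypothetical sketch: int[] arrays replace the offset and data vectors.
final class OffsetLayoutSketch {
  private final int[] offsets; // length is valueCount + 1
  private final int[] data;    // all cells, stored back to back

  OffsetLayoutSketch(int[] offsets, int[] data) {
    this.offsets = offsets;
    this.data = data;
  }

  // Counterpart of getInnerValueCount: total number of cells.
  int innerValueCount() {
    return offsets[offsets.length - 1];
  }

  // Counterpart of getInnerValueCountAt: cells belonging to one value.
  int innerValueCountAt(int index) {
    return offsets[index + 1] - offsets[index];
  }

  // Counterpart of isEmpty for this sketch: a value with no cells.
  boolean isEmpty(int index) {
    return innerValueCountAt(index) == 0;
  }

  // Copies out the cells that make up the value at the given index.
  int[] cellsAt(int index) {
    return Arrays.copyOfRange(data, offsets[index], offsets[index + 1]);
  }

  public static void main(String[] args) {
    // Three values: [10, 11], [], [12, 13, 14]
    OffsetLayoutSketch v = new OffsetLayoutSketch(
        new int[] {0, 2, 2, 5}, new int[] {10, 11, 12, 13, 14});
    System.out.println(v.innerValueCountAt(0));        // prints 2
    System.out.println(v.isEmpty(1));                  // prints true
    System.out.println(Arrays.toString(v.cellsAt(2))); // prints [12, 13, 14]
  }
}
```

An empty value costs no space in the data array; it only repeats the previous offset.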

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedVariableWidthVectorLike.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedVariableWidthVectorLike.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedVariableWidthVectorLike.java
new file mode 100644
index 0000000..93b744e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/RepeatedVariableWidthVectorLike.java
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+public interface RepeatedVariableWidthVectorLike {
+  /**
+   * Allocate a new memory space for this vector.  Must be called prior to using the ValueVector.
+   *
+   * @param totalBytes   Desired size of the underlying data buffer.
+   * @param parentValueCount   Number of separate repeating groupings.
+   * @param childValueCount   Number of supported values in the vector.
+   */
+  void allocateNew(int totalBytes, int parentValueCount, int childValueCount);
+
+  /**
+   * Provide the maximum number of variable-width bytes that can be stored in this vector.
+   * @return the maximum number of variable-width bytes
+   */
+  int getByteCapacity();
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/StateTool.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/StateTool.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/StateTool.java
new file mode 100644
index 0000000..852c72c
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/StateTool.java
@@ -0,0 +1,34 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import java.util.Arrays;
+
+public class StateTool {
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StateTool.class);
+
+  public static <T extends Enum<?>> void check(T currentState, T... expectedStates) {
+    for (T s : expectedStates) {
+      if (s == currentState) {
+        return;
+      }
+    }
+    throw new IllegalArgumentException(String.format("Expected to be in one of these states %s but was actually in state %s", Arrays.toString(expectedStates), currentState));
+  }
+
+}
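
A short usage sketch for the kind of check StateTool.check performs. To keep the snippet self-contained it carries its own copy of the method; the StateCheckSketch class, the Mode enum, and the isAllowed helper are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: a local copy of the check logic plus a small helper.
final class StateCheckSketch {
  enum Mode { INIT, MAP, LIST }

  // Returns normally if currentState is one of expectedStates, otherwise throws.
  @SafeVarargs
  static <T extends Enum<?>> void check(T currentState, T... expectedStates) {
    for (T s : expectedStates) {
      if (s == currentState) {
        return;
      }
    }
    throw new IllegalArgumentException(String.format(
        "Expected to be in one of these states %s but was actually in state %s",
        Arrays.toString(expectedStates), currentState));
  }

  // Illustration-only helper that turns the exception into a boolean.
  static boolean isAllowed(Mode current, Mode... expected) {
    try {
      check(current, expected);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isAllowed(Mode.MAP, Mode.INIT, Mode.MAP));  // prints true
    System.out.println(isAllowed(Mode.LIST, Mode.INIT, Mode.MAP)); // prints false
  }
}
```

Comparing with == is safe here because enum constants are singletons.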

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/VectorWithOrdinal.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/VectorWithOrdinal.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/VectorWithOrdinal.java
new file mode 100644
index 0000000..d04fc1c
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/VectorWithOrdinal.java
@@ -0,0 +1,30 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex;
+
+import org.apache.arrow.vector.ValueVector;
+
+public class VectorWithOrdinal {
+  public final ValueVector vector;
+  public final int ordinal;
+
+  public VectorWithOrdinal(ValueVector v, int ordinal) {
+    this.vector = v;
+    this.ordinal = ordinal;
+  }
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseReader.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseReader.java
new file mode 100644
index 0000000..264e241
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseReader.java
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.impl;
+
+import java.util.Iterator;
+
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+import org.apache.arrow.vector.holders.UnionHolder;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+
+abstract class AbstractBaseReader implements FieldReader{
+
+  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractBaseReader.class);
+  private static final MajorType LATE_BIND_TYPE = new MajorType(MinorType.LATE, DataMode.OPTIONAL);
+
+  private int index;
+
+  public AbstractBaseReader() {
+    super();
+  }
+
+  public void setPosition(int index){
+    this.index = index;
+  }
+
+  int idx(){
+    return index;
+  }
+
+  @Override
+  public void reset() {
+    index = 0;
+  }
+
+  @Override
+  public Iterator<String> iterator() {
+    throw new IllegalStateException("The current reader doesn't support reading as a map.");
+  }
+
+  public MajorType getType(){
+    throw new IllegalStateException("The current reader doesn't support getting type information.");
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return MaterializedField.create("unknown", LATE_BIND_TYPE);
+  }
+
+  @Override
+  public boolean next() {
+    throw new IllegalStateException("The current reader doesn't support getting next information.");
+  }
+
+  @Override
+  public int size() {
+    throw new IllegalStateException("The current reader doesn't support getting size information.");
+  }
+
+  @Override
+  public void read(UnionHolder holder) {
+    holder.reader = this;
+    holder.isSet = this.isSet() ? 1 : 0;
+  }
+
+  @Override
+  public void read(int index, UnionHolder holder) {
+    throw new IllegalStateException("The current reader doesn't support reading union type");
+  }
+
+  @Override
+  public void copyAsValue(UnionWriter writer) {
+    throw new IllegalStateException("The current reader doesn't support reading union type");
+  }
+
+  @Override
+  public void copyAsValue(ListWriter writer) {
+    ComplexCopier.copy(this, (FieldWriter)writer);
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseWriter.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseWriter.java
new file mode 100644
index 0000000..4e1e103
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/AbstractBaseWriter.java
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.impl;
+
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+
+
+abstract class AbstractBaseWriter implements FieldWriter {
+  //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AbstractBaseWriter.class);
+
+  final FieldWriter parent;
+  private int index;
+
+  public AbstractBaseWriter(FieldWriter parent) {
+    this.parent = parent;
+  }
+
+  @Override
+  public String toString() {
+    return super.toString() + "[index = " + index + ", parent = " + parent + "]";
+  }
+
+  @Override
+  public FieldWriter getParent() {
+    return parent;
+  }
+
+  public boolean isRoot() {
+    return parent == null;
+  }
+
+  int idx() {
+    return index;
+  }
+
+  @Override
+  public void setPosition(int index) {
+    this.index = index;
+  }
+
+  @Override
+  public void end() {
+  }
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/ComplexWriterImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/ComplexWriterImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/ComplexWriterImpl.java
new file mode 100644
index 0000000..4e2051f
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/ComplexWriterImpl.java
@@ -0,0 +1,193 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.impl;
+
+import org.apache.arrow.vector.complex.MapVector;
+import org.apache.arrow.vector.complex.StateTool;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ComplexWriter;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+import com.google.common.base.Preconditions;
+
+public class ComplexWriterImpl extends AbstractFieldWriter implements ComplexWriter {
+//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ComplexWriterImpl.class);
+
+  private SingleMapWriter mapRoot;
+  private SingleListWriter listRoot;
+  private final MapVector container;
+
+  Mode mode = Mode.INIT;
+  private final String name;
+  private final boolean unionEnabled;
+
+  private enum Mode { INIT, MAP, LIST };
+
+  public ComplexWriterImpl(String name, MapVector container, boolean unionEnabled){
+    super(null);
+    this.name = name;
+    this.container = container;
+    this.unionEnabled = unionEnabled;
+  }
+
+  public ComplexWriterImpl(String name, MapVector container){
+    this(name, container, false);
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return container.getField();
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return container.getValueCapacity();
+  }
+
+  private void check(Mode... modes){
+    StateTool.check(mode, modes);
+  }
+
+  @Override
+  public void reset(){
+    setPosition(0);
+  }
+
+  @Override
+  public void close() throws Exception {
+    clear();
+    mapRoot.close();
+    if (listRoot != null) {
+      listRoot.close();
+    }
+  }
+
+  @Override
+  public void clear(){
+    switch(mode){
+    case MAP:
+      mapRoot.clear();
+      break;
+    case LIST:
+      listRoot.clear();
+      break;
+    }
+  }
+
+  @Override
+  public void setValueCount(int count){
+    switch(mode){
+    case MAP:
+      mapRoot.setValueCount(count);
+      break;
+    case LIST:
+      listRoot.setValueCount(count);
+      break;
+    }
+  }
+
+  @Override
+  public void setPosition(int index){
+    super.setPosition(index);
+    switch(mode){
+    case MAP:
+      mapRoot.setPosition(index);
+      break;
+    case LIST:
+      listRoot.setPosition(index);
+      break;
+    }
+  }
+
+
+  public MapWriter directMap(){
+    Preconditions.checkArgument(name == null);
+
+    switch(mode){
+
+    case INIT:
+      MapVector map = (MapVector) container;
+      mapRoot = new SingleMapWriter(map, this, unionEnabled);
+      mapRoot.setPosition(idx());
+      mode = Mode.MAP;
+      break;
+
+    case MAP:
+      break;
+
+    default:
+        check(Mode.INIT, Mode.MAP);
+    }
+
+    return mapRoot;
+  }
+
+  @Override
+  public MapWriter rootAsMap() {
+    switch(mode){
+
+    case INIT:
+      MapVector map = container.addOrGet(name, Types.required(MinorType.MAP), MapVector.class);
+      mapRoot = new SingleMapWriter(map, this, unionEnabled);
+      mapRoot.setPosition(idx());
+      mode = Mode.MAP;
+      break;
+
+    case MAP:
+      break;
+
+    default:
+        check(Mode.INIT, Mode.MAP);
+    }
+
+    return mapRoot;
+  }
+
+
+  @Override
+  public void allocate() {
+    if(mapRoot != null) {
+      mapRoot.allocate();
+    } else if(listRoot != null) {
+      listRoot.allocate();
+    }
+  }
+
+  @Override
+  public ListWriter rootAsList() {
+    switch(mode){
+
+    case INIT:
+      listRoot = new SingleListWriter(name, container, this);
+      listRoot.setPosition(idx());
+      mode = Mode.LIST;
+      break;
+
+    case LIST:
+      break;
+
+    default:
+        check(Mode.INIT, Mode.LIST);
+    }
+
+    return listRoot;
+  }
+
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/MapOrListWriterImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/MapOrListWriterImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/MapOrListWriterImpl.java
new file mode 100644
index 0000000..f8a9d42
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/MapOrListWriterImpl.java
@@ -0,0 +1,112 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.impl;
+
+import org.apache.arrow.vector.complex.writer.BaseWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapOrListWriter;
+import org.apache.arrow.vector.complex.writer.BigIntWriter;
+import org.apache.arrow.vector.complex.writer.BitWriter;
+import org.apache.arrow.vector.complex.writer.Float4Writer;
+import org.apache.arrow.vector.complex.writer.Float8Writer;
+import org.apache.arrow.vector.complex.writer.IntWriter;
+import org.apache.arrow.vector.complex.writer.VarBinaryWriter;
+import org.apache.arrow.vector.complex.writer.VarCharWriter;
+
+public class MapOrListWriterImpl implements MapOrListWriter {
+
+  public final BaseWriter.MapWriter map;
+  public final BaseWriter.ListWriter list;
+
+  public MapOrListWriterImpl(final BaseWriter.MapWriter writer) {
+    this.map = writer;
+    this.list = null;
+  }
+
+  public MapOrListWriterImpl(final BaseWriter.ListWriter writer) {
+    this.map = null;
+    this.list = writer;
+  }
+
+  public void start() {
+    if (map != null) {
+      map.start();
+    } else {
+      list.startList();
+    }
+  }
+
+  public void end() {
+    if (map != null) {
+      map.end();
+    } else {
+      list.endList();
+    }
+  }
+
+  public MapOrListWriter map(final String name) {
+    assert map != null;
+    return new MapOrListWriterImpl(map.map(name));
+  }
+
+  public MapOrListWriter listoftmap(final String name) {
+    assert list != null;
+    return new MapOrListWriterImpl(list.map());
+  }
+
+  public MapOrListWriter list(final String name) {
+    assert map != null;
+    return new MapOrListWriterImpl(map.list(name));
+  }
+
+  public boolean isMapWriter() {
+    return map != null;
+  }
+
+  public boolean isListWriter() {
+    return list != null;
+  }
+
+  public VarCharWriter varChar(final String name) {
+    return (map != null) ? map.varChar(name) : list.varChar();
+  }
+
+  public IntWriter integer(final String name) {
+    return (map != null) ? map.integer(name) : list.integer();
+  }
+
+  public BigIntWriter bigInt(final String name) {
+    return (map != null) ? map.bigInt(name) : list.bigInt();
+  }
+
+  public Float4Writer float4(final String name) {
+    return (map != null) ? map.float4(name) : list.float4();
+  }
+
+  public Float8Writer float8(final String name) {
+    return (map != null) ? map.float8(name) : list.float8();
+  }
+
+  public BitWriter bit(final String name) {
+    return (map != null) ? map.bit(name) : list.bit();
+  }
+
+  public VarBinaryWriter binary(final String name) {
+    return (map != null) ? map.varBinary(name) : list.varBinary();
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/PromotableWriter.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/PromotableWriter.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/PromotableWriter.java
new file mode 100644
index 0000000..ea62e02
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/PromotableWriter.java
@@ -0,0 +1,196 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.arrow.vector.complex.impl;
+
+import java.lang.reflect.Constructor;
+
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.VectorDescriptor;
+import org.apache.arrow.vector.ZeroVector;
+import org.apache.arrow.vector.complex.AbstractMapVector;
+import org.apache.arrow.vector.complex.ListVector;
+import org.apache.arrow.vector.complex.UnionVector;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+import org.apache.arrow.vector.types.MaterializedField;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.util.BasicTypeHelper;
+import org.apache.arrow.vector.util.TransferPair;
+
+/**
+ * This FieldWriter implementation delegates all FieldWriter API calls to an inner FieldWriter. This inner field writer
+ * can start as a specific type, and this class will promote the writer to a UnionWriter if a call is made that the specifically
+ * typed writer cannot handle. A new UnionVector is created, wrapping the original vector, and replaces the original vector
+ * in the parent vector, which can be either an AbstractMapVector or a ListVector.
+ */
+public class PromotableWriter extends AbstractPromotableFieldWriter {
+
+  private final AbstractMapVector parentContainer;
+  private final ListVector listVector;
+  private int position;
+
+  private enum State {
+    UNTYPED, SINGLE, UNION
+  }
+
+  private MinorType type;
+  private ValueVector vector;
+  private UnionVector unionVector;
+  private State state;
+  private FieldWriter writer;
+
+  public PromotableWriter(ValueVector v, AbstractMapVector parentContainer) {
+    super(null);
+    this.parentContainer = parentContainer;
+    this.listVector = null;
+    init(v);
+  }
+
+  public PromotableWriter(ValueVector v, ListVector listVector) {
+    super(null);
+    this.listVector = listVector;
+    this.parentContainer = null;
+    init(v);
+  }
+
+  private void init(ValueVector v) {
+    if (v instanceof UnionVector) {
+      state = State.UNION;
+      unionVector = (UnionVector) v;
+      writer = new UnionWriter(unionVector);
+    } else if (v instanceof ZeroVector) {
+      state = State.UNTYPED;
+    } else {
+      setWriter(v);
+    }
+  }
+
+  private void setWriter(ValueVector v) {
+    state = State.SINGLE;
+    vector = v;
+    type = v.getField().getType().getMinorType();
+    Class writerClass = BasicTypeHelper
+        .getWriterImpl(v.getField().getType().getMinorType(), v.getField().getDataMode());
+    if (writerClass.equals(SingleListWriter.class)) {
+      writerClass = UnionListWriter.class;
+    }
+    Class vectorClass = BasicTypeHelper.getValueVectorClass(v.getField().getType().getMinorType(), v.getField()
+        .getDataMode());
+    try {
+      Constructor constructor = null;
+      for (Constructor c : writerClass.getConstructors()) {
+        if (c.getParameterTypes().length == 3) {
+          constructor = c;
+        }
+      }
+      if (constructor == null) {
+        constructor = writerClass.getConstructor(vectorClass, AbstractFieldWriter.class);
+        writer = (FieldWriter) constructor.newInstance(vector, null);
+      } else {
+        writer = (FieldWriter) constructor.newInstance(vector, null, true);
+      }
+    } catch (ReflectiveOperationException e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    FieldWriter w = getWriter();
+    if (w == null) {
+      position = index;
+    } else {
+      w.setPosition(index);
+    }
+  }
+
+  protected FieldWriter getWriter(MinorType type) {
+    if (state == State.UNION) {
+      return writer;
+    }
+    if (state == State.UNTYPED) {
+      if (type == null) {
+        return null;
+      }
+      ValueVector v = listVector.addOrGetVector(new VectorDescriptor(new MajorType(type, DataMode.OPTIONAL))).getVector();
+      v.allocateNew();
+      setWriter(v);
+      writer.setPosition(position);
+    }
+    if (type != this.type) {
+      return promoteToUnion();
+    }
+    return writer;
+  }
+
+  @Override
+  public boolean isEmptyMap() {
+    return writer.isEmptyMap();
+  }
+
+  protected FieldWriter getWriter() {
+    return getWriter(type);
+  }
+
+  private FieldWriter promoteToUnion() {
+    String name = vector.getField().getLastName();
+    TransferPair tp = vector.getTransferPair(vector.getField().getType().getMinorType().name().toLowerCase(), vector.getAllocator());
+    tp.transfer();
+    if (parentContainer != null) {
+      unionVector = parentContainer.addOrGet(name, new MajorType(MinorType.UNION, DataMode.OPTIONAL), UnionVector.class);
+    } else if (listVector != null) {
+      unionVector = listVector.promoteToUnion();
+    }
+    unionVector.addVector(tp.getTo());
+    writer = new UnionWriter(unionVector);
+    writer.setPosition(idx());
+    for (int i = 0; i < idx(); i++) {
+      unionVector.getMutator().setType(i, vector.getField().getType().getMinorType());
+    }
+    vector = null;
+    state = State.UNION;
+    return writer;
+  }
+
+  @Override
+  public void allocate() {
+    getWriter().allocate();
+  }
+
+  @Override
+  public void clear() {
+    getWriter().clear();
+  }
+
+  @Override
+  public MaterializedField getField() {
+    return getWriter().getField();
+  }
+
+  @Override
+  public int getValueCapacity() {
+    return getWriter().getValueCapacity();
+  }
+
+  @Override
+  public void close() throws Exception {
+    getWriter().close();
+  }
+}
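
The UNTYPED, SINGLE, and UNION states that PromotableWriter moves through can be sketched without any vectors. This is a simplified model, not the writer itself: PromotionSketch, its Kind enum (standing in for MinorType), and the backing List are hypothetical, and the sketch does not model the vector transfer or the per-slot type tagging done in promoteToUnion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the promotion state machine only.
final class PromotionSketch {
  enum State { UNTYPED, SINGLE, UNION }
  enum Kind { INT, VARCHAR } // stand-in for MinorType

  private State state = State.UNTYPED;
  private Kind kind; // the single type while in SINGLE, null otherwise
  private final List<Object> values = new ArrayList<>();

  void write(Kind incoming, Object value) {
    if (state == State.UNTYPED) {
      // The first write fixes the type.
      kind = incoming;
      state = State.SINGLE;
    } else if (state == State.SINGLE && incoming != kind) {
      // A write of a different type promotes to a union; there is no way back.
      kind = null;
      state = State.UNION;
    }
    values.add(value);
  }

  State state() {
    return state;
  }

  int size() {
    return values.size();
  }

  public static void main(String[] args) {
    PromotionSketch w = new PromotionSketch();
    w.write(Kind.INT, 1);
    System.out.println(w.state()); // prints SINGLE
    w.write(Kind.VARCHAR, "a");
    System.out.println(w.state()); // prints UNION
  }
}
```

Values written before the promotion are kept, which corresponds to the original vector being wrapped by the new union rather than discarded.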

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedListReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedListReaderImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedListReaderImpl.java
new file mode 100644
index 0000000..dd1a152
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedListReaderImpl.java
@@ -0,0 +1,145 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex.impl;
+
+
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.complex.RepeatedListVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.holders.RepeatedListHolder;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+public class RepeatedListReaderImpl extends AbstractFieldReader{
+  private static final int NO_VALUES = Integer.MAX_VALUE - 1;
+  private static final MajorType TYPE = new MajorType(MinorType.LIST, DataMode.REPEATED);
+  private final String name;
+  private final RepeatedListVector container;
+  private FieldReader reader;
+
+  public RepeatedListReaderImpl(String name, RepeatedListVector container) {
+    super();
+    this.name = name;
+    this.container = container;
+  }
+
+  @Override
+  public MajorType getType() {
+    return TYPE;
+  }
+
+  @Override
+  public void copyAsValue(ListWriter writer) {
+    if (currentOffset == NO_VALUES) {
+      return;
+    }
+    RepeatedListWriter impl = (RepeatedListWriter) writer;
+    impl.container.copyFromSafe(idx(), impl.idx(), container);
+  }
+
+  @Override
+  public void copyAsField(String name, MapWriter writer) {
+    if (currentOffset == NO_VALUES) {
+      return;
+    }
+    RepeatedListWriter impl = (RepeatedListWriter) writer.list(name);
+    impl.container.copyFromSafe(idx(), impl.idx(), container);
+  }
+
+  private int currentOffset;
+  private int maxOffset;
+
+  @Override
+  public void reset() {
+    super.reset();
+    currentOffset = 0;
+    maxOffset = 0;
+    if (reader != null) {
+      reader.reset();
+    }
+    reader = null;
+  }
+
+  @Override
+  public int size() {
+    return maxOffset - currentOffset;
+  }
+
+  @Override
+  public void setPosition(int index) {
+    if (index < 0 || index == NO_VALUES) {
+      currentOffset = NO_VALUES;
+      return;
+    }
+
+    super.setPosition(index);
+    RepeatedListHolder h = new RepeatedListHolder();
+    container.getAccessor().get(index, h);
+    if (h.start == h.end) {
+      currentOffset = NO_VALUES;
+    } else {
+      currentOffset = h.start-1;
+      maxOffset = h.end;
+      if(reader != null) {
+        reader.setPosition(currentOffset);
+      }
+    }
+  }
+
+  @Override
+  public boolean next() {
+    if (currentOffset +1 < maxOffset) {
+      currentOffset++;
+      if (reader != null) {
+        reader.setPosition(currentOffset);
+      }
+      return true;
+    } else {
+      currentOffset = NO_VALUES;
+      return false;
+    }
+  }
+
+  @Override
+  public Object readObject() {
+    return container.getAccessor().getObject(idx());
+  }
+
+  @Override
+  public FieldReader reader() {
+    if (reader == null) {
+      ValueVector child = container.getChild(name);
+      if (child == null) {
+        reader = NullReader.INSTANCE;
+      } else {
+        reader = child.getReader();
+      }
+      reader.setPosition(currentOffset);
+    }
+    return reader;
+  }
+
+  public boolean isSet() {
+    return true;
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedMapReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedMapReaderImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedMapReaderImpl.java
new file mode 100644
index 0000000..09a831d
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/RepeatedMapReaderImpl.java
@@ -0,0 +1,192 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex.impl;
+
+import java.util.Map;
+
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.complex.RepeatedMapVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.holders.RepeatedMapHolder;
+import org.apache.arrow.vector.types.Types.MajorType;
+
+import com.google.common.collect.Maps;
+
+@SuppressWarnings("unused")
+public class RepeatedMapReaderImpl extends AbstractFieldReader{
+  private static final int NO_VALUES = Integer.MAX_VALUE - 1;
+
+  private final RepeatedMapVector vector;
+  private final Map<String, FieldReader> fields = Maps.newHashMap();
+
+  public RepeatedMapReaderImpl(RepeatedMapVector vector) {
+    this.vector = vector;
+  }
+
+  private void setChildrenPosition(int index) {
+    for (FieldReader r : fields.values()) {
+      r.setPosition(index);
+    }
+  }
+
+  @Override
+  public FieldReader reader(String name) {
+    FieldReader reader = fields.get(name);
+    if (reader == null) {
+      ValueVector child = vector.getChild(name);
+      if (child == null) {
+        reader = NullReader.INSTANCE;
+      } else {
+        reader = child.getReader();
+      }
+      fields.put(name, reader);
+      reader.setPosition(currentOffset);
+    }
+    return reader;
+  }
+
+  @Override
+  public FieldReader reader() {
+    if (currentOffset == NO_VALUES) {
+      return NullReader.INSTANCE;
+    }
+
+    setChildrenPosition(currentOffset);
+    return new SingleLikeRepeatedMapReaderImpl(vector, this);
+  }
+
+  private int currentOffset;
+  private int maxOffset;
+
+  @Override
+  public void reset() {
+    super.reset();
+    currentOffset = 0;
+    maxOffset = 0;
+    for (FieldReader reader:fields.values()) {
+      reader.reset();
+    }
+    fields.clear();
+  }
+
+  @Override
+  public int size() {
+    if (isNull()) {
+      return 0;
+    }
+    return maxOffset - (currentOffset < 0 ? 0 : currentOffset);
+  }
+
+  @Override
+  public void setPosition(int index) {
+    if (index < 0 || index == NO_VALUES) {
+      currentOffset = NO_VALUES;
+      return;
+    }
+
+    super.setPosition(index);
+    RepeatedMapHolder h = new RepeatedMapHolder();
+    vector.getAccessor().get(index, h);
+    if (h.start == h.end) {
+      currentOffset = NO_VALUES;
+    } else {
+      currentOffset = h.start-1;
+      maxOffset = h.end;
+      setChildrenPosition(currentOffset);
+    }
+  }
+
+  public void setSinglePosition(int index, int childIndex) {
+    super.setPosition(index);
+    RepeatedMapHolder h = new RepeatedMapHolder();
+    vector.getAccessor().get(index, h);
+    if (h.start == h.end) {
+      currentOffset = NO_VALUES;
+    } else {
+      int singleOffset = h.start + childIndex;
+      assert singleOffset < h.end;
+      currentOffset = singleOffset;
+      maxOffset = singleOffset + 1;
+      setChildrenPosition(singleOffset);
+    }
+  }
+
+  @Override
+  public boolean next() {
+    if (currentOffset +1 < maxOffset) {
+      setChildrenPosition(++currentOffset);
+      return true;
+    } else {
+      currentOffset = NO_VALUES;
+      return false;
+    }
+  }
+
+  public boolean isNull() {
+    return currentOffset == NO_VALUES;
+  }
+
+  @Override
+  public Object readObject() {
+    return vector.getAccessor().getObject(idx());
+  }
+
+  @Override
+  public MajorType getType() {
+    return vector.getField().getType();
+  }
+
+  @Override
+  public java.util.Iterator<String> iterator() {
+    return vector.fieldNameIterator();
+  }
+
+  @Override
+  public boolean isSet() {
+    return true;
+  }
+
+  @Override
+  public void copyAsValue(MapWriter writer) {
+    if (currentOffset == NO_VALUES) {
+      return;
+    }
+    RepeatedMapWriter impl = (RepeatedMapWriter) writer;
+    impl.container.copyFromSafe(idx(), impl.idx(), vector);
+  }
+
+  public void copyAsValueSingle(MapWriter writer) {
+    if (currentOffset == NO_VALUES) {
+      return;
+    }
+    SingleMapWriter impl = (SingleMapWriter) writer;
+    impl.container.copyFromSafe(currentOffset, impl.idx(), vector);
+  }
+
+  @Override
+  public void copyAsField(String name, MapWriter writer) {
+    if (currentOffset == NO_VALUES) {
+      return;
+    }
+    RepeatedMapWriter impl = (RepeatedMapWriter) writer.map(name);
+    impl.container.copyFromSafe(idx(), impl.idx(), vector);
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleLikeRepeatedMapReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleLikeRepeatedMapReaderImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleLikeRepeatedMapReaderImpl.java
new file mode 100644
index 0000000..086d26e
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleLikeRepeatedMapReaderImpl.java
@@ -0,0 +1,89 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector.complex.impl;
+
+import java.util.Iterator;
+
+import org.apache.arrow.vector.complex.RepeatedMapVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+public class SingleLikeRepeatedMapReaderImpl extends AbstractFieldReader{
+
+  private RepeatedMapReaderImpl delegate;
+
+  public SingleLikeRepeatedMapReaderImpl(RepeatedMapVector vector, FieldReader delegate) {
+    this.delegate = (RepeatedMapReaderImpl) delegate;
+  }
+
+  @Override
+  public int size() {
+    throw new UnsupportedOperationException("You can't call size on a single map reader.");
+  }
+
+  @Override
+  public boolean next() {
+    throw new UnsupportedOperationException("You can't call next on a single map reader.");
+  }
+
+  @Override
+  public MajorType getType() {
+    return Types.required(MinorType.MAP);
+  }
+
+
+  @Override
+  public void copyAsValue(MapWriter writer) {
+    delegate.copyAsValueSingle(writer);
+  }
+
+  public void copyAsValueSingle(MapWriter writer){
+    delegate.copyAsValueSingle(writer);
+  }
+
+  @Override
+  public FieldReader reader(String name) {
+    return delegate.reader(name);
+  }
+
+  @Override
+  public void setPosition(int index) {
+    delegate.setPosition(index);
+  }
+
+  @Override
+  public Object readObject() {
+    return delegate.readObject();
+  }
+
+  @Override
+  public Iterator<String> iterator() {
+    return delegate.iterator();
+  }
+
+  @Override
+  public boolean isSet() {
+    return ! delegate.isNull();
+  }
+
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleListReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleListReaderImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleListReaderImpl.java
new file mode 100644
index 0000000..f16f628
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleListReaderImpl.java
@@ -0,0 +1,88 @@
+
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex.impl;
+
+
+import org.apache.arrow.vector.complex.AbstractContainerVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+@SuppressWarnings("unused")
+public class SingleListReaderImpl extends AbstractFieldReader{
+
+  private static final MajorType TYPE = Types.optional(MinorType.LIST);
+  private final String name;
+  private final AbstractContainerVector container;
+  private FieldReader reader;
+
+  public SingleListReaderImpl(String name, AbstractContainerVector container) {
+    super();
+    this.name = name;
+    this.container = container;
+  }
+
+  @Override
+  public MajorType getType() {
+    return TYPE;
+  }
+
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    if (reader != null) {
+      reader.setPosition(index);
+    }
+  }
+
+  @Override
+  public Object readObject() {
+    return reader().readObject(); // reader() lazily creates the child reader, so this cannot NPE
+  }
+
+  @Override
+  public FieldReader reader() {
+    if (reader == null) {
+      reader = container.getChild(name).getReader();
+      setPosition(idx());
+    }
+    return reader;
+  }
+
+  @Override
+  public boolean isSet() {
+    return false;
+  }
+
+  @Override
+  public void copyAsValue(ListWriter writer) {
+    throw new UnsupportedOperationException("Generic list copying not yet supported.  Please resolve to typed list.");
+  }
+
+  @Override
+  public void copyAsField(String name, MapWriter writer) {
+    throw new UnsupportedOperationException("Generic list copying not yet supported.  Please resolve to typed list.");
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleMapReaderImpl.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleMapReaderImpl.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleMapReaderImpl.java
new file mode 100644
index 0000000..84b9980
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/SingleMapReaderImpl.java
@@ -0,0 +1,108 @@
+
+
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex.impl;
+
+
+import java.util.Map;
+
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.complex.MapVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter;
+import org.apache.arrow.vector.types.Types.MajorType;
+
+import com.google.common.collect.Maps;
+
+@SuppressWarnings("unused")
+public class SingleMapReaderImpl extends AbstractFieldReader{
+
+  private final MapVector vector;
+  private final Map<String, FieldReader> fields = Maps.newHashMap();
+
+  public SingleMapReaderImpl(MapVector vector) {
+    this.vector = vector;
+  }
+
+  private void setChildrenPosition(int index){
+    for(FieldReader r : fields.values()){
+      r.setPosition(index);
+    }
+  }
+
+  @Override
+  public FieldReader reader(String name){
+    FieldReader reader = fields.get(name);
+    if(reader == null){
+      ValueVector child = vector.getChild(name);
+      if(child == null){
+        reader = NullReader.INSTANCE;
+      }else{
+        reader = child.getReader();
+      }
+      fields.put(name, reader);
+      reader.setPosition(idx());
+    }
+    return reader;
+  }
+
+  @Override
+  public void setPosition(int index){
+    super.setPosition(index);
+    for(FieldReader r : fields.values()){
+      r.setPosition(index);
+    }
+  }
+
+  @Override
+  public Object readObject() {
+    return vector.getAccessor().getObject(idx());
+  }
+
+  @Override
+  public boolean isSet() {
+    return true;
+  }
+
+  @Override
+  public MajorType getType(){
+    return vector.getField().getType();
+  }
+
+  @Override
+  public java.util.Iterator<String> iterator(){
+    return vector.fieldNameIterator();
+  }
+
+  @Override
+  public void copyAsValue(MapWriter writer){
+    SingleMapWriter impl = (SingleMapWriter) writer;
+    impl.container.copyFromSafe(idx(), impl.idx(), vector);
+  }
+
+  @Override
+  public void copyAsField(String name, MapWriter writer){
+    SingleMapWriter impl = (SingleMapWriter) writer.map(name);
+    impl.container.copyFromSafe(idx(), impl.idx(), vector);
+  }
+
+
+}
+

http://git-wip-us.apache.org/repos/asf/arrow/blob/fa5f0299/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/UnionListReader.java
----------------------------------------------------------------------
diff --git a/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/UnionListReader.java b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/UnionListReader.java
new file mode 100644
index 0000000..9b54d02
--- /dev/null
+++ b/java/vector/src/main/java/org/apache/arrow/vector/complex/impl/UnionListReader.java
@@ -0,0 +1,98 @@
+/*******************************************************************************
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ ******************************************************************************/
+package org.apache.arrow.vector.complex.impl;
+
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.ValueVector;
+import org.apache.arrow.vector.complex.ListVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.ListWriter;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+import org.apache.arrow.vector.holders.UnionHolder;
+import org.apache.arrow.vector.types.Types.DataMode;
+import org.apache.arrow.vector.types.Types.MajorType;
+import org.apache.arrow.vector.types.Types.MinorType;
+
+public class UnionListReader extends AbstractFieldReader {
+
+  private ListVector vector;
+  private ValueVector data;
+  private UInt4Vector offsets;
+
+  public UnionListReader(ListVector vector) {
+    this.vector = vector;
+    this.data = vector.getDataVector();
+    this.offsets = vector.getOffsetVector();
+  }
+
+  @Override
+  public boolean isSet() {
+    return true;
+  }
+
+  MajorType type = new MajorType(MinorType.LIST, DataMode.OPTIONAL);
+
+  public MajorType getType() {
+    return type;
+  }
+
+  private int currentOffset;
+  private int maxOffset;
+
+  @Override
+  public void setPosition(int index) {
+    super.setPosition(index);
+    currentOffset = offsets.getAccessor().get(index) - 1;
+    maxOffset = offsets.getAccessor().get(index + 1);
+  }
+
+  @Override
+  public FieldReader reader() {
+    return data.getReader();
+  }
+
+  @Override
+  public Object readObject() {
+    return vector.getAccessor().getObject(idx());
+  }
+
+  @Override
+  public void read(int index, UnionHolder holder) {
+    setPosition(idx());
+    for (int i = -1; i < index; i++) {
+      next();
+    }
+    holder.reader = data.getReader();
+    holder.isSet = data.getReader().isSet() ? 1 : 0;
+  }
+
+  @Override
+  public boolean next() {
+    if (currentOffset + 1 < maxOffset) {
+      data.getReader().setPosition(++currentOffset);
+      return true;
+    } else {
+      return false;
+    }
+  }
+
+  public void copyAsValue(ListWriter writer) {
+    ComplexCopier.copy(this, (FieldWriter) writer);
+  }
+}