Posted to dev@parquet.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/09/24 12:15:00 UTC

[jira] [Commented] (PARQUET-1399) Move parquet-mr related code from parquet-format

    [ https://issues.apache.org/jira/browse/PARQUET-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625731#comment-16625731 ] 

ASF GitHub Bot commented on PARQUET-1399:
-----------------------------------------

gszadovszky closed pull request #517: PARQUET-1399: Move parquet-mr related code from parquet-format
URL: https://github.com/apache/parquet-mr/pull/517

This PR was merged from a forked repository.
Because GitHub hides the original diff once such a PR is merged, the
diff is reproduced below for the sake of provenance:

diff --git a/parquet-avro/pom.xml b/parquet-avro/pom.xml
index 3592121d7..bc3603fe6 100644
--- a/parquet-avro/pom.xml
+++ b/parquet-avro/pom.xml
@@ -45,8 +45,8 @@
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
-      <artifactId>parquet-format</artifactId>
-      <version>${parquet.format.version}</version>
+      <artifactId>parquet-format-structures</artifactId>
+      <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.avro</groupId>
diff --git a/parquet-common/pom.xml b/parquet-common/pom.xml
index e7b2446a6..f9a60a94b 100644
--- a/parquet-common/pom.xml
+++ b/parquet-common/pom.xml
@@ -38,8 +38,8 @@
   <dependencies>
     <dependency>
       <groupId>org.apache.parquet</groupId>
-      <artifactId>parquet-format</artifactId>
-      <version>${parquet.format.version}</version>
+      <artifactId>parquet-format-structures</artifactId>
+      <version>${project.version}</version>
     </dependency>
 
     <dependency>
diff --git a/parquet-format-structures/pom.xml b/parquet-format-structures/pom.xml
new file mode 100644
index 000000000..e69cced3b
--- /dev/null
+++ b/parquet-format-structures/pom.xml
@@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  - Licensed to the Apache Software Foundation (ASF) under one
+  - or more contributor license agreements.  See the NOTICE file
+  - distributed with this work for additional information
+  - regarding copyright ownership.  The ASF licenses this file
+  - to you under the Apache License, Version 2.0 (the
+  - "License"); you may not use this file except in compliance
+  - with the License.  You may obtain a copy of the License at
+  -
+  -   http://www.apache.org/licenses/LICENSE-2.0
+  -
+  - Unless required by applicable law or agreed to in writing,
+  - software distributed under the License is distributed on an
+  - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  - KIND, either express or implied.  See the License for the
+  - specific language governing permissions and limitations
+  - under the License.
+  -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.parquet</groupId>
+    <artifactId>parquet</artifactId>
+    <relativePath>../pom.xml</relativePath>
+    <version>1.10.1-SNAPSHOT</version>
+  </parent>
+
+  <artifactId>parquet-format-structures</artifactId>
+  <packaging>jar</packaging>
+
+  <name>Apache Parquet Format Structures</name>
+  <url>http://parquet.apache.org/</url>
+  <description>Parquet-mr related java classes to use the parquet-format thrift structures.</description>
+
+  <properties>
+    <parquet.thrift.path>${project.build.directory}/parquet-format-thrift</parquet.thrift.path>
+  </properties>
+
+  <build>
+    <plugins>
+      <!-- Getting the parquet-format thrift file -->
+       <plugin>
+         <groupId>org.apache.maven.plugins</groupId>
+         <artifactId>maven-dependency-plugin</artifactId>
+         <executions>
+           <execution>
+             <id>unpack</id>
+             <phase>generate-sources</phase>
+             <goals>
+               <goal>unpack</goal>
+             </goals>
+             <configuration>
+               <artifactItems>
+                 <artifactItem>
+                   <groupId>org.apache.parquet</groupId>
+                   <artifactId>parquet-format</artifactId>
+                   <version>${parquet.format.version}</version>
+                   <type>jar</type>
+                 </artifactItem>
+               </artifactItems>
+               <includes>parquet.thrift</includes>
+               <outputDirectory>${parquet.thrift.path}</outputDirectory>
+             </configuration>
+           </execution>
+         </executions>
+       </plugin>
+      <!-- thrift -->
+      <plugin>
+        <groupId>org.apache.thrift.tools</groupId>
+        <artifactId>maven-thrift-plugin</artifactId>
+        <version>0.1.11</version>
+        <configuration>
+          <thriftSourceRoot>${parquet.thrift.path}</thriftSourceRoot>
+          <thriftExecutable>${format.thrift.executable}</thriftExecutable>
+        </configuration>
+        <executions>
+          <execution>
+            <id>thrift-sources</id>
+            <phase>generate-sources</phase>
+            <goals>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-shade-plugin</artifactId>
+        <executions>
+          <execution>
+            <phase>package</phase>
+            <goals>
+              <goal>shade</goal>
+            </goals>
+            <configuration>
+              <keepDependenciesWithProvidedScope>true</keepDependenciesWithProvidedScope>
+              <artifactSet>
+                <includes>
+                  <include>org.apache.thrift:libthrift</include>
+                </includes>
+              </artifactSet>
+              <filters>
+                <filter>
+                  <!-- Sigh. The Thrift jar contains its source -->
+                  <artifact>org.apache.thrift:libthrift</artifact>
+                  <excludes>
+                    <exclude>**/*.java</exclude>
+                    <exclude>META-INF/LICENSE.txt</exclude>
+                    <exclude>META-INF/NOTICE.txt</exclude>
+                  </excludes>
+                </filter>
+              </filters>
+              <relocations>
+                <relocation>
+                  <pattern>org.apache.thrift</pattern>
+                  <shadedPattern>${shade.prefix}.org.apache.thrift</shadedPattern>
+                </relocation>
+              </relocations>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <!-- Configure build/javadoc as well to support "mvn javadoc:javadoc" -->
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-javadoc-plugin</artifactId>
+        <configuration>
+          <!-- We have to turn off the javadoc check because thrift generates improper comments -->
+          <additionalparam>-Xdoclint:none</additionalparam>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+
+  <reports>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-javadoc-plugin</artifactId>
+        <configuration>
+          <!-- We have to turn off the javadoc check because thrift generates improper comments -->
+          <additionalparam>-Xdoclint:none</additionalparam>
+        </configuration>
+      </plugin>
+    </plugins>
+  </reports>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.slf4j</groupId>
+      <artifactId>slf4j-api</artifactId>
+      <version>${slf4j.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.thrift</groupId>
+      <artifactId>libthrift</artifactId>
+      <version>${format.thrift.version}</version>
+    </dependency>
+  </dependencies>
+
+  <profiles>
+    <profile>
+      <activation>
+        <os>
+          <family>!windows</family>
+        </os>
+      </activation>
+      <id>UnixClassOS</id>
+      <build>
+        <plugins>
+          <plugin>
+            <groupId>org.codehaus.mojo</groupId>
+            <artifactId>exec-maven-plugin</artifactId>
+            <version>1.2.1</version>
+            <executions>
+              <execution>
+                <id>check-thrift-version</id>
+                <phase>generate-sources</phase>
+                <goals>
+                  <goal>exec</goal>
+                </goals>
+                <configuration>
+                  <executable>sh</executable>
+                  <workingDirectory>${basedir}</workingDirectory>
+                  <arguments>
+                    <argument>-c</argument>
+                    <argument>${thrift.executable} -version | fgrep 'Thrift version ${thrift.version}' &amp;&amp; exit 0;
+                      echo "=================================================================================";
+                      echo "========== [FATAL] Build is configured to require Thrift version ${thrift.version} ==========";
+                      echo -n "========== Currently installed: ";
+                      ${thrift.executable} -version;
+                      echo "=================================================================================";
+                      exit 1
+                    </argument>
+                  </arguments>
+                </configuration>
+              </execution>
+            </executions>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+  </profiles>
+</project>
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/InterningProtocol.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/InterningProtocol.java
new file mode 100644
index 000000000..a405d4f87
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/InterningProtocol.java
@@ -0,0 +1,231 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.format;
+
+import java.nio.ByteBuffer;
+
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TField;
+import org.apache.thrift.protocol.TList;
+import org.apache.thrift.protocol.TMap;
+import org.apache.thrift.protocol.TMessage;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TSet;
+import org.apache.thrift.protocol.TStruct;
+import org.apache.thrift.transport.TTransport;
+
+/**
+ * TProtocol that interns the strings.
+ */
+public class InterningProtocol extends TProtocol {
+
+  private final TProtocol delegate;
+
+  public InterningProtocol(TProtocol delegate) {
+    super(delegate.getTransport());
+    this.delegate = delegate;
+  }
+
+  public TTransport getTransport() {
+    return delegate.getTransport();
+  }
+
+  public void writeMessageBegin(TMessage message) throws TException {
+    delegate.writeMessageBegin(message);
+  }
+
+  public void writeMessageEnd() throws TException {
+    delegate.writeMessageEnd();
+  }
+
+  public int hashCode() {
+    return delegate.hashCode();
+  }
+
+  public void writeStructBegin(TStruct struct) throws TException {
+    delegate.writeStructBegin(struct);
+  }
+
+  public void writeStructEnd() throws TException {
+    delegate.writeStructEnd();
+  }
+
+  public void writeFieldBegin(TField field) throws TException {
+    delegate.writeFieldBegin(field);
+  }
+
+  public void writeFieldEnd() throws TException {
+    delegate.writeFieldEnd();
+  }
+
+  public void writeFieldStop() throws TException {
+    delegate.writeFieldStop();
+  }
+
+  public void writeMapBegin(TMap map) throws TException {
+    delegate.writeMapBegin(map);
+  }
+
+  public void writeMapEnd() throws TException {
+    delegate.writeMapEnd();
+  }
+
+  public void writeListBegin(TList list) throws TException {
+    delegate.writeListBegin(list);
+  }
+
+  public void writeListEnd() throws TException {
+    delegate.writeListEnd();
+  }
+
+  public void writeSetBegin(TSet set) throws TException {
+    delegate.writeSetBegin(set);
+  }
+
+  public void writeSetEnd() throws TException {
+    delegate.writeSetEnd();
+  }
+
+  public void writeBool(boolean b) throws TException {
+    delegate.writeBool(b);
+  }
+
+  public void writeByte(byte b) throws TException {
+    delegate.writeByte(b);
+  }
+
+  public void writeI16(short i16) throws TException {
+    delegate.writeI16(i16);
+  }
+
+  public void writeI32(int i32) throws TException {
+    delegate.writeI32(i32);
+  }
+
+  public void writeI64(long i64) throws TException {
+    delegate.writeI64(i64);
+  }
+
+  public void writeDouble(double dub) throws TException {
+    delegate.writeDouble(dub);
+  }
+
+  public void writeString(String str) throws TException {
+    delegate.writeString(str);
+  }
+
+  public void writeBinary(ByteBuffer buf) throws TException {
+    delegate.writeBinary(buf);
+  }
+
+  public TMessage readMessageBegin() throws TException {
+    return delegate.readMessageBegin();
+  }
+
+  public void readMessageEnd() throws TException {
+    delegate.readMessageEnd();
+  }
+
+  public TStruct readStructBegin() throws TException {
+    return delegate.readStructBegin();
+  }
+
+  public void readStructEnd() throws TException {
+    delegate.readStructEnd();
+  }
+
+  public TField readFieldBegin() throws TException {
+    return delegate.readFieldBegin();
+  }
+
+  public void readFieldEnd() throws TException {
+    delegate.readFieldEnd();
+  }
+
+  public TMap readMapBegin() throws TException {
+    return delegate.readMapBegin();
+  }
+
+  public void readMapEnd() throws TException {
+    delegate.readMapEnd();
+  }
+
+  public TList readListBegin() throws TException {
+    return delegate.readListBegin();
+  }
+
+  public void readListEnd() throws TException {
+    delegate.readListEnd();
+  }
+
+  public TSet readSetBegin() throws TException {
+    return delegate.readSetBegin();
+  }
+
+  public void readSetEnd() throws TException {
+    delegate.readSetEnd();
+  }
+
+  public boolean equals(Object obj) {
+    return delegate.equals(obj);
+  }
+
+  public boolean readBool() throws TException {
+    return delegate.readBool();
+  }
+
+  public byte readByte() throws TException {
+    return delegate.readByte();
+  }
+
+  public short readI16() throws TException {
+    return delegate.readI16();
+  }
+
+  public int readI32() throws TException {
+    return delegate.readI32();
+  }
+
+  public long readI64() throws TException {
+    return delegate.readI64();
+  }
+
+  public double readDouble() throws TException {
+    return delegate.readDouble();
+  }
+
+  public String readString() throws TException {
+    // this is where we intern the strings
+    return delegate.readString().intern();
+  }
+
+  public ByteBuffer readBinary() throws TException {
+    return delegate.readBinary();
+  }
+
+  public void reset() {
+    delegate.reset();
+  }
+
+  public String toString() {
+    return delegate.toString();
+  }
+
+}
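Note on the InterningProtocol diff above: every method forwards to the delegate unchanged except readString(), which calls String.intern() on the value it reads. The following is a minimal sketch of that effect using only the JDK. The class name InternSketch and the helpers decode and decodeInterned are hypothetical stand-ins for a Thrift string read; they are not part of this PR or of parquet-mr.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of parquet-mr.
class InternSketch {

  // Stand-in for a protocol read: every call decodes a fresh String object.
  static String decode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Mirrors the shape of InterningProtocol.readString(): read, then intern.
  static String decodeInterned(byte[] bytes) {
    return decode(bytes).intern();
  }

  public static void main(String[] args) {
    byte[] raw = "column_name".getBytes(StandardCharsets.UTF_8);

    // Without interning, equal strings are distinct objects.
    System.out.println(decode(raw) == decode(raw));                 // false

    // With interning, equal strings share one canonical instance.
    System.out.println(decodeInterned(raw) == decodeInterned(raw)); // true
  }
}
```

Presumably the motivation is that metadata repeats the same strings many times, so sharing one instance per distinct value reduces retained memory; the PR text itself does not state this.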
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/LogicalTypes.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/LogicalTypes.java
new file mode 100644
index 000000000..7c63e41da
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/LogicalTypes.java
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.format;
+
+/**
+ * Convenience instances of logical type classes.
+ */
+public class LogicalTypes {
+  public static class TimeUnits {
+    public static final TimeUnit MILLIS = TimeUnit.MILLIS(new MilliSeconds());
+    public static final TimeUnit MICROS = TimeUnit.MICROS(new MicroSeconds());
+  }
+
+  public static LogicalType DECIMAL(int scale, int precision) {
+    return LogicalType.DECIMAL(new DecimalType(scale, precision));
+  }
+
+  public static final LogicalType UTF8 = LogicalType.STRING(new StringType());
+  public static final LogicalType MAP  = LogicalType.MAP(new MapType());
+  public static final LogicalType LIST = LogicalType.LIST(new ListType());
+  public static final LogicalType ENUM = LogicalType.ENUM(new EnumType());
+  public static final LogicalType DATE = LogicalType.DATE(new DateType());
+  public static final LogicalType TIME_MILLIS = LogicalType.TIME(new TimeType(true, TimeUnits.MILLIS));
+  public static final LogicalType TIME_MICROS = LogicalType.TIME(new TimeType(true, TimeUnits.MICROS));
+  public static final LogicalType TIMESTAMP_MILLIS = LogicalType.TIMESTAMP(new TimestampType(true, TimeUnits.MILLIS));
+  public static final LogicalType TIMESTAMP_MICROS = LogicalType.TIMESTAMP(new TimestampType(true, TimeUnits.MICROS));
+  public static final LogicalType INT_8 = LogicalType.INTEGER(new IntType((byte) 8, true));
+  public static final LogicalType INT_16 = LogicalType.INTEGER(new IntType((byte) 16, true));
+  public static final LogicalType INT_32 = LogicalType.INTEGER(new IntType((byte) 32, true));
+  public static final LogicalType INT_64 = LogicalType.INTEGER(new IntType((byte) 64, true));
+  public static final LogicalType UINT_8 = LogicalType.INTEGER(new IntType((byte) 8, false));
+  public static final LogicalType UINT_16 = LogicalType.INTEGER(new IntType((byte) 16, false));
+  public static final LogicalType UINT_32 = LogicalType.INTEGER(new IntType((byte) 32, false));
+  public static final LogicalType UINT_64 = LogicalType.INTEGER(new IntType((byte) 64, false));
+  public static final LogicalType UNKNOWN = LogicalType.UNKNOWN(new NullType());
+  public static final LogicalType JSON = LogicalType.JSON(new JsonType());
+  public static final LogicalType BSON = LogicalType.BSON(new BsonType());
+}
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/Util.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/Util.java
new file mode 100644
index 000000000..d09d007a2
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/Util.java
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.format;
+
+import static org.apache.parquet.format.FileMetaData._Fields.CREATED_BY;
+import static org.apache.parquet.format.FileMetaData._Fields.KEY_VALUE_METADATA;
+import static org.apache.parquet.format.FileMetaData._Fields.NUM_ROWS;
+import static org.apache.parquet.format.FileMetaData._Fields.ROW_GROUPS;
+import static org.apache.parquet.format.FileMetaData._Fields.SCHEMA;
+import static org.apache.parquet.format.FileMetaData._Fields.VERSION;
+import static org.apache.parquet.format.event.Consumers.fieldConsumer;
+import static org.apache.parquet.format.event.Consumers.listElementsOf;
+import static org.apache.parquet.format.event.Consumers.listOf;
+import static org.apache.parquet.format.event.Consumers.struct;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.List;
+
+import org.apache.thrift.TBase;
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TCompactProtocol;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.transport.TIOStreamTransport;
+
+import org.apache.parquet.format.event.Consumers.Consumer;
+import org.apache.parquet.format.event.Consumers.DelegatingFieldConsumer;
+import org.apache.parquet.format.event.EventBasedThriftReader;
+import org.apache.parquet.format.event.TypedConsumer.I32Consumer;
+import org.apache.parquet.format.event.TypedConsumer.I64Consumer;
+import org.apache.parquet.format.event.TypedConsumer.StringConsumer;
+
+/**
+ * Utility to read/write metadata
+ * We use the TCompactProtocol to serialize metadata
+ */
+public class Util {
+
+  public static void writeColumnIndex(ColumnIndex columnIndex, OutputStream to) throws IOException {
+    write(columnIndex, to);
+  }
+
+  public static ColumnIndex readColumnIndex(InputStream from) throws IOException {
+    return read(from, new ColumnIndex());
+  }
+
+  public static void writeOffsetIndex(OffsetIndex offsetIndex, OutputStream to) throws IOException {
+    write(offsetIndex, to);
+  }
+
+  public static OffsetIndex readOffsetIndex(InputStream from) throws IOException {
+    return read(from, new OffsetIndex());
+  }
+
+  public static void writePageHeader(PageHeader pageHeader, OutputStream to) throws IOException {
+    write(pageHeader, to);
+  }
+
+  public static PageHeader readPageHeader(InputStream from) throws IOException {
+    return read(from, new PageHeader());
+  }
+
+  public static void writeFileMetaData(org.apache.parquet.format.FileMetaData fileMetadata, OutputStream to) throws IOException {
+    write(fileMetadata, to);
+  }
+
+  public static FileMetaData readFileMetaData(InputStream from) throws IOException {
+    return read(from, new FileMetaData());
+  }
+  /**
+   * reads the meta data from the stream
+   * @param from the stream to read the metadata from
+   * @param skipRowGroups whether row groups should be skipped
+   * @return the resulting metadata
+   * @throws IOException if any I/O error occurs during the reading
+   */
+  public static FileMetaData readFileMetaData(InputStream from, boolean skipRowGroups) throws IOException {
+    FileMetaData md = new FileMetaData();
+    if (skipRowGroups) {
+      readFileMetaData(from, new DefaultFileMetaDataConsumer(md), skipRowGroups);
+    } else {
+      read(from, md);
+    }
+    return md;
+  }
+
+  /**
+   * To read metadata in a streaming fashion.
+   *
+   */
+  public static abstract class FileMetaDataConsumer {
+    abstract public void setVersion(int version);
+    abstract public void setSchema(List<SchemaElement> schema);
+    abstract public void setNumRows(long numRows);
+    abstract public void addRowGroup(RowGroup rowGroup);
+    abstract public void addKeyValueMetaData(KeyValue kv);
+    abstract public void setCreatedBy(String createdBy);
+  }
+
+  /**
+   * Simple default consumer that sets the fields
+   *
+   */
+  public static final class DefaultFileMetaDataConsumer extends FileMetaDataConsumer {
+    private final FileMetaData md;
+
+    public DefaultFileMetaDataConsumer(FileMetaData md) {
+      this.md = md;
+    }
+
+    @Override
+    public void setVersion(int version) {
+      md.setVersion(version);
+    }
+
+    @Override
+    public void setSchema(List<SchemaElement> schema) {
+      md.setSchema(schema);
+    }
+
+    @Override
+    public void setNumRows(long numRows) {
+      md.setNum_rows(numRows);
+    }
+
+    @Override
+    public void setCreatedBy(String createdBy) {
+      md.setCreated_by(createdBy);
+    }
+
+    @Override
+    public void addRowGroup(RowGroup rowGroup) {
+      md.addToRow_groups(rowGroup);
+    }
+
+    @Override
+    public void addKeyValueMetaData(KeyValue kv) {
+      md.addToKey_value_metadata(kv);
+    }
+  }
+
+  public static void readFileMetaData(InputStream from, FileMetaDataConsumer consumer) throws IOException {
+    readFileMetaData(from, consumer, false);
+  }
+
+  public static void readFileMetaData(InputStream from, final FileMetaDataConsumer consumer, boolean skipRowGroups) throws IOException {
+    try {
+      DelegatingFieldConsumer eventConsumer = fieldConsumer()
+      .onField(VERSION, new I32Consumer() {
+        @Override
+        public void consume(int value) {
+          consumer.setVersion(value);
+        }
+      }).onField(SCHEMA, listOf(SchemaElement.class, new Consumer<List<SchemaElement>>() {
+        @Override
+        public void consume(List<SchemaElement> schema) {
+          consumer.setSchema(schema);
+        }
+      })).onField(NUM_ROWS, new I64Consumer() {
+        @Override
+        public void consume(long value) {
+          consumer.setNumRows(value);
+        }
+      }).onField(KEY_VALUE_METADATA, listElementsOf(struct(KeyValue.class, new Consumer<KeyValue>() {
+        @Override
+        public void consume(KeyValue kv) {
+          consumer.addKeyValueMetaData(kv);
+        }
+      }))).onField(CREATED_BY, new StringConsumer() {
+        @Override
+        public void consume(String value) {
+          consumer.setCreatedBy(value);
+        }
+      });
+      if (!skipRowGroups) {
+        eventConsumer = eventConsumer.onField(ROW_GROUPS, listElementsOf(struct(RowGroup.class, new Consumer<RowGroup>() {
+          @Override
+          public void consume(RowGroup rowGroup) {
+            consumer.addRowGroup(rowGroup);
+          }
+        })));
+      }
+      new EventBasedThriftReader(protocol(from)).readStruct(eventConsumer);
+
+    } catch (TException e) {
+      throw new IOException("can not read FileMetaData: " + e.getMessage(), e);
+    }
+  }
+
+  private static TProtocol protocol(OutputStream to) {
+    return protocol(new TIOStreamTransport(to));
+  }
+
+  private static TProtocol protocol(InputStream from) {
+    return protocol(new TIOStreamTransport(from));
+  }
+
+  private static InterningProtocol protocol(TIOStreamTransport t) {
+    return new InterningProtocol(new TCompactProtocol(t));
+  }
+
+  private static <T extends TBase<?,?>> T read(InputStream from, T tbase) throws IOException {
+    try {
+      tbase.read(protocol(from));
+      return tbase;
+    } catch (TException e) {
+      throw new IOException("can not read " + tbase.getClass() + ": " + e.getMessage(), e);
+    }
+  }
+
+  private static void write(TBase<?, ?> tbase, OutputStream to) throws IOException {
+    try {
+      tbase.write(protocol(to));
+    } catch (TException e) {
+      throw new IOException("can not write " + tbase, e);
+    }
+  }
+}
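Note on the `eventConsumer = eventConsumer.onField(ROW_GROUPS, ...)` reassignment in readFileMetaData above: according to the Consumers.java diff in this same PR, DelegatingFieldConsumer.onField copies the handler map and returns a new consumer instead of mutating the receiver, and fields without a registered handler fall through to a skipping default. That is why the result has to be reassigned, and why leaving ROW_GROUPS unregistered is enough to skip row groups. A minimal sketch of that copy-on-register pattern follows, using only the JDK. FieldRegistry, Handler and dispatch are hypothetical names chosen for illustration, and the field id is arbitrary; none of this is parquet-mr API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of parquet-mr.
final class FieldRegistry {

  // Hypothetical callback type standing in for TypedConsumer.
  interface Handler {
    void handle(String value);
  }

  private final Map<Short, Handler> handlers;

  FieldRegistry() {
    this(Collections.<Short, Handler>emptyMap());
  }

  private FieldRegistry(Map<Short, Handler> handlers) {
    this.handlers = Collections.unmodifiableMap(handlers);
  }

  // Returns a NEW registry that also knows the handler; the receiver is unchanged.
  FieldRegistry onField(short id, Handler handler) {
    Map<Short, Handler> copy = new HashMap<Short, Handler>(handlers);
    copy.put(id, handler);
    return new FieldRegistry(copy);
  }

  // Returns true if a handler was registered for the id and has been invoked;
  // false means the field was skipped.
  boolean dispatch(short id, String value) {
    Handler h = handlers.get(id);
    if (h == null) {
      return false;
    }
    h.handle(value);
    return true;
  }

  public static void main(String[] args) {
    short someField = 4; // arbitrary id for illustration
    FieldRegistry base = new FieldRegistry();
    FieldRegistry extended = base.onField(someField, v -> System.out.println("got " + v));

    System.out.println(base.dispatch(someField, "rg"));     // false: base is unchanged
    System.out.println(extended.dispatch(someField, "rg")); // prints "got rg", then true
  }
}
```

Discarding the return value of onField would silently drop the registration, which is the mistake the reassignment in readFileMetaData avoids.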
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/event/Consumers.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/Consumers.java
new file mode 100644
index 000000000..ef87997e7
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/Consumers.java
@@ -0,0 +1,193 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.format.event;
+
+import static java.util.Collections.unmodifiableMap;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.thrift.TBase;
+import org.apache.thrift.TException;
+import org.apache.thrift.TFieldIdEnum;
+import org.apache.thrift.protocol.TList;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TProtocolUtil;
+
+import org.apache.parquet.format.event.Consumers.Consumer;
+import org.apache.parquet.format.event.TypedConsumer.ListConsumer;
+import org.apache.parquet.format.event.TypedConsumer.StructConsumer;
+
+/**
+ * Entry point for reading thrift in a streaming fashion
+ */
+public class Consumers {
+
+  /**
+   * To consume objects coming from a DelegatingFieldConsumer
+   *
+   * @param <T> the type of consumed objects
+   */
+  public static interface Consumer<T> {
+    void consume(T t);
+  }
+
+  /**
+   * Delegates reading the field to TypedConsumers.
+   * There is one TypedConsumer per thrift type.
+   * use {@link #onField(TFieldIdEnum, TypedConsumer)} et al. to consume specific thrift fields.
+   * @see Consumers#fieldConsumer()
+   */
+  public static class DelegatingFieldConsumer implements FieldConsumer {
+
+    private final Map<Short, TypedConsumer> contexts;
+    private final FieldConsumer defaultFieldEventConsumer;
+
+    private DelegatingFieldConsumer(FieldConsumer defaultFieldEventConsumer, Map<Short, TypedConsumer> contexts) {
+      this.defaultFieldEventConsumer = defaultFieldEventConsumer;
+      this.contexts = unmodifiableMap(contexts);
+    }
+
+    private DelegatingFieldConsumer() {
+      this(new SkippingFieldConsumer());
+    }
+
+    private DelegatingFieldConsumer(FieldConsumer defaultFieldEventConsumer) {
+      this(defaultFieldEventConsumer, Collections.<Short, TypedConsumer>emptyMap());
+    }
+
+    public DelegatingFieldConsumer onField(TFieldIdEnum e, TypedConsumer typedConsumer) {
+      Map<Short, TypedConsumer> newContexts = new HashMap<Short, TypedConsumer>(contexts);
+      newContexts.put(e.getThriftFieldId(), typedConsumer);
+      return new DelegatingFieldConsumer(defaultFieldEventConsumer, newContexts);
+    }
+
+    @Override
+    public void consumeField(
+        TProtocol protocol, EventBasedThriftReader reader,
+        short id, byte type) throws TException {
+      TypedConsumer delegate = contexts.get(id);
+      if (delegate != null) {
+        delegate.read(protocol, reader, type);
+      } else {
+        defaultFieldEventConsumer.consumeField(protocol, reader, id, type);
+      }
+    }
+  }
+
+  /**
+   * call onField on the resulting DelegatingFieldConsumer to handle individual fields
+   * @return a new DelegatingFieldConsumer
+   */
+  public static DelegatingFieldConsumer fieldConsumer() {
+    return new DelegatingFieldConsumer();
+  }
+
+  /**
+   * To consume a list of elements
+   * @param c the class of the list content
+   * @param consumer the consumer that will receive the list
+   * @param <T> the type of the list content
+   * @return a ListConsumer that can be passed to the DelegatingFieldConsumer
+   */
+  public static <T extends TBase<T,? extends TFieldIdEnum>> ListConsumer listOf(Class<T> c, final Consumer<List<T>> consumer) {
+    class ListConsumer implements Consumer<T> {
+      List<T> list;
+      @Override
+      public void consume(T t) {
+        list.add(t);
+      }
+    }
+    final ListConsumer co = new ListConsumer();
+    return new DelegatingListElementsConsumer(struct(c, co)) {
+      @Override
+      public void consumeList(TProtocol protocol,
+          EventBasedThriftReader reader, TList tList) throws TException {
+        co.list = new ArrayList<T>();
+        super.consumeList(protocol, reader, tList);
+        consumer.consume(co.list);
+      }
+    };
+  }
+
+  /**
+   * To consume list elements one by one
+   * @param consumer the consumer that will read the elements
+   * @return a ListConsumer that can be passed to the DelegatingFieldConsumer
+   */
+  public static ListConsumer listElementsOf(TypedConsumer consumer) {
+    return new DelegatingListElementsConsumer(consumer);
+  }
+
+  public static <T extends TBase<T,? extends TFieldIdEnum>> StructConsumer struct(final Class<T> c, final Consumer<T> consumer) {
+    return new TBaseStructConsumer<T>(c, consumer);
+  }
+}
+
+class SkippingFieldConsumer implements FieldConsumer {
+  @Override
+  public void consumeField(TProtocol protocol, EventBasedThriftReader reader, short id, byte type) throws TException {
+    TProtocolUtil.skip(protocol, type);
+  }
+}
+
+class DelegatingListElementsConsumer extends ListConsumer {
+
+  private TypedConsumer elementConsumer;
+
+  protected DelegatingListElementsConsumer(TypedConsumer consumer) {
+    this.elementConsumer = consumer;
+  }
+
+  @Override
+  public void consumeElement(TProtocol protocol, EventBasedThriftReader reader, byte elemType) throws TException {
+    elementConsumer.read(protocol, reader, elemType);
+  }
+}
+class TBaseStructConsumer<T extends TBase<T, ? extends TFieldIdEnum>> extends StructConsumer {
+
+  private final Class<T> c;
+  private Consumer<T> consumer;
+
+  public TBaseStructConsumer(Class<T> c, Consumer<T> consumer) {
+    this.c = c;
+    this.consumer = consumer;
+  }
+
+  @Override
+  public void consumeStruct(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+    T o = newObject();
+    o.read(protocol);
+    consumer.consume(o);
+  }
+
+  protected T newObject() {
+    try {
+      return c.newInstance();
+    } catch (InstantiationException e) {
+      throw new RuntimeException(c.getName(), e);
+    } catch (IllegalAccessException e) {
+      throw new RuntimeException(c.getName(), e);
+    }
+  }
+
+}
\ No newline at end of file
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/event/EventBasedThriftReader.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/EventBasedThriftReader.java
new file mode 100644
index 000000000..2fb9cf651
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/EventBasedThriftReader.java
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.format.event;
+
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TField;
+import org.apache.thrift.protocol.TList;
+import org.apache.thrift.protocol.TMap;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TSet;
+import org.apache.thrift.protocol.TType;
+
+import org.apache.parquet.format.event.TypedConsumer.ListConsumer;
+import org.apache.parquet.format.event.TypedConsumer.MapConsumer;
+import org.apache.parquet.format.event.TypedConsumer.SetConsumer;
+
+/**
+ * Event based reader for Thrift
+ */
+public final class EventBasedThriftReader {
+
+  private final TProtocol protocol;
+
+  /**
+   * @param protocol the protocol to read from
+   */
+  public EventBasedThriftReader(TProtocol protocol) {
+    this.protocol = protocol;
+  }
+
+  /**
+   * reads a Struct from the underlying protocol and passes the field events to the FieldConsumer
+   * @param c the field consumer
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readStruct(FieldConsumer c) throws TException {
+    protocol.readStructBegin();
+    readStructContent(c);
+    protocol.readStructEnd();
+  }
+
+  /**
+   * reads the content of a struct (fields) from the underlying protocol and passes the events to c
+   * @param c the field consumer
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readStructContent(FieldConsumer c) throws TException {
+    TField field;
+    while (true) {
+      field = protocol.readFieldBegin();
+      if (field.type == TType.STOP) {
+        break;
+      }
+      c.consumeField(protocol, this, field.id, field.type);
+    }
+  }
+
+  /**
+   * reads the set content (elements) from the underlying protocol and passes the events to the set event consumer
+   * @param eventConsumer the consumer
+   * @param tSet the set descriptor
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readSetContent(SetConsumer eventConsumer, TSet tSet)
+      throws TException {
+    for (int i = 0; i < tSet.size; i++) {
+      eventConsumer.consumeElement(protocol, this, tSet.elemType);
+    }
+  }
+
+  /**
+   * reads the map content (key values) from the underlying protocol and passes the events to the map event consumer
+   * @param eventConsumer the consumer
+   * @param tMap the map descriptor
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readMapContent(MapConsumer eventConsumer, TMap tMap)
+      throws TException {
+    for (int i = 0; i < tMap.size; i++) {
+      eventConsumer.consumeEntry(protocol, this, tMap.keyType, tMap.valueType);
+    }
+  }
+
+  /**
+   * reads a key-value pair
+   * @param keyType the type of the key
+   * @param keyConsumer the consumer for the key
+   * @param valueType the type of the value
+   * @param valueConsumer the consumer for the value
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readMapEntry(byte keyType, TypedConsumer keyConsumer, byte valueType, TypedConsumer valueConsumer)
+      throws TException {
+    keyConsumer.read(protocol, this, keyType);
+    valueConsumer.read(protocol, this, valueType);
+  }
+
+  /**
+   * reads the list content (elements) from the underlying protocol and passes the events to the list event consumer
+   * @param eventConsumer the consumer
+   * @param tList the list descriptor
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void readListContent(ListConsumer eventConsumer, TList tList)
+      throws TException {
+    for (int i = 0; i < tList.size; i++) {
+      eventConsumer.consumeElement(protocol, this, tList.elemType);
+    }
+  }
+}
\ No newline at end of file
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/event/FieldConsumer.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/FieldConsumer.java
new file mode 100644
index 000000000..6656934b6
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/FieldConsumer.java
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.format.event;
+
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TProtocol;
+
+/**
+ * To receive Thrift field events
+ */
+public interface FieldConsumer {
+
+  /**
+   * called by the EventBasedThriftReader when reading a field from a Struct
+   * @param protocol the underlying protocol
+   * @param eventBasedThriftReader the reader to delegate further calls to
+   * @param id the id of the field
+   * @param type the type of the field
+   * @throws TException if any thrift related error occurs during the reading
+   */
+  public void consumeField(TProtocol protocol, EventBasedThriftReader eventBasedThriftReader, short id, byte type) throws TException;
+
+}
\ No newline at end of file
diff --git a/parquet-format-structures/src/main/java/org/apache/parquet/format/event/TypedConsumer.java b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/TypedConsumer.java
new file mode 100644
index 000000000..734449f5e
--- /dev/null
+++ b/parquet-format-structures/src/main/java/org/apache/parquet/format/event/TypedConsumer.java
@@ -0,0 +1,205 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.format.event;
+
+import static org.apache.thrift.protocol.TType.BOOL;
+import static org.apache.thrift.protocol.TType.BYTE;
+import static org.apache.thrift.protocol.TType.DOUBLE;
+import static org.apache.thrift.protocol.TType.I16;
+import static org.apache.thrift.protocol.TType.I32;
+import static org.apache.thrift.protocol.TType.I64;
+import static org.apache.thrift.protocol.TType.LIST;
+import static org.apache.thrift.protocol.TType.MAP;
+import static org.apache.thrift.protocol.TType.SET;
+import static org.apache.thrift.protocol.TType.STRING;
+import static org.apache.thrift.protocol.TType.STRUCT;
+
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TList;
+import org.apache.thrift.protocol.TMap;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TSet;
+
+/**
+ * receive thrift events of a given type
+ */
+abstract public class TypedConsumer {
+
+  abstract public static class DoubleConsumer extends TypedConsumer {
+    protected DoubleConsumer() { super(DOUBLE); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readDouble());
+    }
+    abstract public void consume(double value);
+  }
+
+  abstract public static class ByteConsumer extends TypedConsumer {
+    protected ByteConsumer() { super(BYTE); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readByte());
+    }
+    abstract public void consume(byte value);
+  }
+
+  abstract public static class BoolConsumer extends TypedConsumer {
+    protected BoolConsumer() { super(BOOL); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readBool());
+    }
+    abstract public void consume(boolean value);
+  }
+
+  abstract public static class I32Consumer extends TypedConsumer {
+    protected I32Consumer() { super(I32); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readI32());
+    }
+    abstract public void consume(int value);
+  }
+
+  abstract public static class I64Consumer extends TypedConsumer {
+    protected I64Consumer() { super(I64); }
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readI64());
+    }
+    abstract public void consume(long value);
+  }
+
+  abstract public static class I16Consumer extends TypedConsumer {
+    protected I16Consumer() { super(I16); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readI16());
+    }
+    abstract public void consume(short value);
+  }
+
+  abstract public static class StringConsumer extends TypedConsumer {
+    protected StringConsumer() { super(STRING); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consume(protocol.readString());
+    }
+    abstract public void consume(String value);
+  }
+
+  abstract public static class StructConsumer extends TypedConsumer {
+    protected StructConsumer() { super(STRUCT); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consumeStruct(protocol, reader);
+    }
+    /**
+     * can either delegate to the reader or read the struct from the protocol
+     * reader.readStruct(fieldConsumer);
+     * @param protocol the underlying protocol
+     * @param reader the reader to delegate to
+     * @throws TException if any thrift related error occurs during the reading
+     */
+    abstract public void consumeStruct(TProtocol protocol, EventBasedThriftReader reader) throws TException;
+  }
+
+  abstract public static class ListConsumer extends TypedConsumer {
+    protected ListConsumer() { super(LIST); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consumeList(protocol, reader, protocol.readListBegin());
+      protocol.readListEnd();
+    }
+    public void consumeList(TProtocol protocol, EventBasedThriftReader reader, TList tList) throws TException {
+      reader.readListContent(this, tList);
+    }
+    /**
+     * can either delegate to the reader or read the element from the protocol
+     * @param protocol the underlying protocol
+     * @param reader the reader to delegate to
+     * @param elemType the type of the element
+     * @throws TException if any thrift related error occurs during the reading
+     */
+    abstract public void consumeElement(TProtocol protocol, EventBasedThriftReader reader, byte elemType) throws TException;
+  }
+
+  abstract public static class SetConsumer extends TypedConsumer {
+    protected SetConsumer() { super(SET); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader) throws TException {
+      this.consumeSet(protocol, reader, protocol.readSetBegin());
+      protocol.readSetEnd();
+    }
+    public void consumeSet(TProtocol protocol, EventBasedThriftReader reader, TSet tSet) throws TException {
+      reader.readSetContent(this, tSet);
+    }
+    /**
+     * can either delegate to the reader or read the element from the protocol
+     * @param protocol the underlying protocol
+     * @param reader the reader to delegate to
+     * @param elemType the type of the element
+     * @throws TException if any thrift related error occurs during the reading
+     */
+    abstract public void consumeElement(
+        TProtocol protocol, EventBasedThriftReader reader,
+        byte elemType) throws TException;
+  }
+
+  abstract public static class MapConsumer extends TypedConsumer {
+    protected MapConsumer() { super(MAP); }
+    @Override
+    final void read(TProtocol protocol, EventBasedThriftReader reader)
+        throws TException {
+      this.consumeMap(protocol, reader , protocol.readMapBegin());
+      protocol.readMapEnd();
+    }
+    public void consumeMap(TProtocol protocol, EventBasedThriftReader reader, TMap tMap) throws TException {
+      reader.readMapContent(this, tMap);
+    }
+    /**
+     * can either delegate to the reader or read the map entry from the protocol
+     * @param protocol the underlying protocol
+     * @param reader the reader to delegate to
+     * @param keyType the type of the key
+     * @param valueType the type of the value
+     * @throws TException if any thrift related error occurs during the reading
+     */
+    abstract public void consumeEntry(
+        TProtocol protocol, EventBasedThriftReader reader,
+        byte keyType, byte valueType) throws TException;
+  }
+
+  public final byte type;
+
+  private TypedConsumer(byte type) {
+    this.type = type;
+  }
+
+  final public void read(TProtocol protocol, EventBasedThriftReader reader, byte type) throws TException {
+    if (this.type != type) {
+      throw new TException(
+          "Incorrect type in stream. "
+              + "Expected " + this.type
+              + " but got " + type);
+    }
+    this.read(protocol, reader);
+  }
+
+  abstract void read(TProtocol protocol, EventBasedThriftReader reader) throws TException;
+}
\ No newline at end of file
diff --git a/parquet-format-structures/src/test/java/org/apache/parquet/format/TestUtil.java b/parquet-format-structures/src/test/java/org/apache/parquet/format/TestUtil.java
new file mode 100644
index 000000000..1adf0998f
--- /dev/null
+++ b/parquet-format-structures/src/test/java/org/apache/parquet/format/TestUtil.java
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.format;
+
+import static java.util.Arrays.asList;
+import static junit.framework.Assert.assertEquals;
+import static junit.framework.Assert.assertNull;
+import static org.apache.parquet.format.Util.readFileMetaData;
+import static org.apache.parquet.format.Util.writeFileMetaData;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+
+import org.junit.Test;
+
+import org.apache.parquet.format.Util.DefaultFileMetaDataConsumer;
+public class TestUtil {
+
+  @Test
+  public void testReadFileMetadata() throws Exception {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    FileMetaData md = new FileMetaData(
+        1,
+        asList(new SchemaElement("foo")),
+        10,
+        asList(
+            new RowGroup(
+                asList(
+                    new ColumnChunk(0),
+                    new ColumnChunk(1)
+                    ),
+                10,
+                5),
+            new RowGroup(
+                asList(
+                    new ColumnChunk(2),
+                    new ColumnChunk(3)
+                    ),
+                11,
+                5)
+        )
+    );
+    writeFileMetaData(md , baos);
+    FileMetaData md2 = readFileMetaData(in(baos));
+    FileMetaData md3 = new FileMetaData();
+    readFileMetaData(in(baos), new DefaultFileMetaDataConsumer(md3));
+    FileMetaData md4 = new FileMetaData();
+    readFileMetaData(in(baos), new DefaultFileMetaDataConsumer(md4), true);
+    FileMetaData md5 = readFileMetaData(in(baos), true);
+    FileMetaData md6 = readFileMetaData(in(baos), false);
+    assertEquals(md, md2);
+    assertEquals(md, md3);
+    assertNull(md4.getRow_groups());
+    assertNull(md5.getRow_groups());
+    assertEquals(md4, md5);
+    md4.setRow_groups(md.getRow_groups());
+    md5.setRow_groups(md.getRow_groups());
+    assertEquals(md, md4);
+    assertEquals(md, md5);
+    assertEquals(md4, md5);
+    assertEquals(md, md6);
+  }
+
+  private ByteArrayInputStream in(ByteArrayOutputStream baos) {
+    return new ByteArrayInputStream(baos.toByteArray());
+  }
+}
diff --git a/parquet-hadoop/pom.xml b/parquet-hadoop/pom.xml
index 98972a235..8d31f7dd0 100644
--- a/parquet-hadoop/pom.xml
+++ b/parquet-hadoop/pom.xml
@@ -43,8 +43,8 @@
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
-      <artifactId>parquet-format</artifactId>
-      <version>${parquet.format.version}</version>
+      <artifactId>parquet-format-structures</artifactId>
+      <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
diff --git a/parquet-pig/pom.xml b/parquet-pig/pom.xml
index 3b7e5703f..0d3f202c2 100644
--- a/parquet-pig/pom.xml
+++ b/parquet-pig/pom.xml
@@ -48,8 +48,8 @@
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
-      <artifactId>parquet-format</artifactId>
-      <version>${parquet.format.version}</version>
+      <artifactId>parquet-format-structures</artifactId>
+      <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.pig</groupId>
diff --git a/parquet-protobuf/pom.xml b/parquet-protobuf/pom.xml
index b6f4627b1..329046db7 100644
--- a/parquet-protobuf/pom.xml
+++ b/parquet-protobuf/pom.xml
@@ -86,6 +86,17 @@
     </dependency>
   </dependencies>
 
+  <dependencyManagement>
+    <dependencies>
+      <!-- com.twitter.elephantbird brings in an older version of libthrift, so we force our own version -->
+      <dependency>
+        <groupId>org.apache.thrift</groupId>
+        <artifactId>libthrift</artifactId>
+        <version>${format.thrift.version}</version>
+      </dependency>
+    </dependencies>
+  </dependencyManagement>
+
   <developers>
     <developer>
       <id>lukasnalezenec</id>
diff --git a/parquet-thrift/pom.xml b/parquet-thrift/pom.xml
index 51a6b9b17..4340430b0 100644
--- a/parquet-thrift/pom.xml
+++ b/parquet-thrift/pom.xml
@@ -144,6 +144,17 @@
 
   </dependencies>
 
+  <dependencyManagement>
+    <dependencies>
+      <!-- com.twitter.elephantbird brings in an older version of libthrift, so we force our own version -->
+      <dependency>
+        <groupId>org.apache.thrift</groupId>
+        <artifactId>libthrift</artifactId>
+        <version>${thrift.version}</version>
+      </dependency>
+    </dependencies>
+  </dependencyManagement>
+
   <build>
     <plugins>
       <plugin>
diff --git a/parquet-tools/pom.xml b/parquet-tools/pom.xml
index 566f8f1c3..32ee4d8ed 100644
--- a/parquet-tools/pom.xml
+++ b/parquet-tools/pom.xml
@@ -48,8 +48,8 @@
   <dependencies>
     <dependency>
       <groupId>org.apache.parquet</groupId>
-      <artifactId>parquet-format</artifactId>
-      <version>${parquet.format.version}</version>
+      <artifactId>parquet-format-structures</artifactId>
+      <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
diff --git a/pom.xml b/pom.xml
index 7b3f36fe5..4c9d79c12 100644
--- a/pom.xml
+++ b/pom.xml
@@ -84,6 +84,7 @@
     <parquet.format.version>2.4.0</parquet.format.version>
     <previous.version>1.7.0</previous.version>
     <thrift.executable>thrift</thrift.executable>
+    <format.thrift.executable>thrift</format.thrift.executable>
     <scala.version>2.10.6</scala.version>
     <!-- scala.binary.version is used for projects that fetch dependencies that are in scala -->
     <scala.binary.version>2.10</scala.binary.version>
@@ -92,6 +93,7 @@
     <pig.classifier>h2</pig.classifier>
     <thrift-maven-plugin.version>0.10.0</thrift-maven-plugin.version>
     <thrift.version>0.9.3</thrift.version>
+    <format.thrift.version>0.9.3</format.thrift.version>
     <fastutil.version>7.0.13</fastutil.version>
     <semver.api.version>0.9.33</semver.api.version>
     <slf4j.version>1.7.22</slf4j.version>
@@ -117,6 +119,7 @@
     <module>parquet-column</module>
     <module>parquet-common</module>
     <module>parquet-encoding</module>
+    <module>parquet-format-structures</module>
     <module>parquet-generator</module>
     <module>parquet-hadoop</module>
     <module>parquet-jackson</module>
@@ -175,6 +178,11 @@
             </reports>
           </reportSet>
         </reportSets>
+        <configuration>
+          <sourceFileExcludes>
+            <sourceFileExclude>**/generated-sources/**/*.java</sourceFileExclude>
+          </sourceFileExcludes>
+        </configuration>
       </plugin>
       <plugin>
         <groupId>org.codehaus.mojo</groupId>
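
Note on the diff above: DelegatingFieldConsumer never mutates itself. Each
onField call copies the handler map and returns a new consumer, and
consumeField falls back to a skipping default for unregistered field ids.
The sketch below reproduces only that pattern using JDK types, so it runs
without libthrift. It is a simplified stand-in, not the code from the pull
request: FieldDispatchSketch, Handler, and Dispatcher are hypothetical names,
and plain String values replace the TProtocol/EventBasedThriftReader
arguments of the real classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the dispatch pattern of DelegatingFieldConsumer.
// All names are hypothetical; no Thrift types are involved.
final class FieldDispatchSketch {

  /** Receives the value of one field (stands in for TypedConsumer). */
  interface Handler {
    void handle(String value);
  }

  /** Immutable dispatcher: onField returns a new instance instead of mutating. */
  static final class Dispatcher {
    private final Map<Short, Handler> handlers;
    private final Handler fallback;

    private Dispatcher(Handler fallback, Map<Short, Handler> handlers) {
      this.fallback = fallback;
      this.handlers = Collections.unmodifiableMap(handlers);
    }

    /** Copies the current handlers, adds one, and returns a new dispatcher. */
    Dispatcher onField(short id, Handler handler) {
      Map<Short, Handler> copy = new HashMap<>(handlers);
      copy.put(id, handler);
      return new Dispatcher(fallback, copy);
    }

    /** Routes a field to its registered handler, or to the fallback. */
    void consumeField(short id, String value) {
      Handler handler = handlers.get(id);
      if (handler != null) {
        handler.handle(value);
      } else {
        fallback.handle(value);
      }
    }
  }

  /** Starts with no handlers and a fallback that ignores the field. */
  static Dispatcher dispatcher() {
    return new Dispatcher(value -> { }, new HashMap<>());
  }

  public static void main(String[] args) {
    List<String> seen = new ArrayList<>();
    Dispatcher base = dispatcher();
    Dispatcher withField = base.onField((short) 1, seen::add);
    withField.consumeField((short) 1, "foo");     // handled
    withField.consumeField((short) 2, "skipped"); // no handler, falls back
    base.consumeField((short) 1, "ignored");      // base was not modified
    System.out.println(seen);                     // prints [foo]
  }
}
```

Because every onField call returns a fresh object, a partially configured
dispatcher can be shared and extended in different directions without one
caller affecting another. The cost is one map copy per registered field.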


 



> Move parquet-mr related code from parquet-format
> ------------------------------------------------
>
>                 Key: PARQUET-1399
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1399
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>            Priority: Major
>              Labels: pull-request-available
>
> There are Java classes in the [parquet-format|https://github.com/apache/parquet-format] repo that should live in the [parquet-mr|https://github.com/apache/parquet-mr] repo instead: [java classes|https://github.com/apache/parquet-format/tree/master/src/main] and [test classes|https://github.com/apache/parquet-format/tree/master/src/test]
> The idea is to create a separate module in [parquet-mr|https://github.com/apache/parquet-mr] and depend on it instead of depending on [parquet-format|https://github.com/apache/parquet-format]. Only this separate module would depend on [parquet-format|https://github.com/apache/parquet-format] directly.


