Posted to issues@tajo.apache.org by "David Chen (JIRA)" <ji...@apache.org> on 2014/03/22 02:48:43 UTC

[jira] [Updated] (TAJO-30) Parquet Integration

     [ https://issues.apache.org/jira/browse/TAJO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Chen updated TAJO-30:
---------------------------

    Description: 
Parquet is a columnar storage format developed by Twitter. Implement Parquet (http://parquet.io/) support for Tajo.

The implementation consists of the following:

 * {{TajoParquetReader}} and {{TajoParquetWriter}} - Top-level reader and writer for serializing/deserializing to Tajo Tuples.
 * {{TajoReadSupport}} and {{TajoWriteSupport}} - Abstractions to perform conversion between Parquet and Tajo records.
 * {{TajoRecordMaterializer}} - Materializes Tajo Tuples from Parquet's internal representation.
 * {{TajoRecordConverter}} - Used by {{TajoRecordMaterializer}} to materialize a Tajo Tuple.
 * {{TajoSchemaConverter}} - Converts between Tajo and Parquet schemas.
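
As a rough illustration of the schema-conversion step, here is a minimal, self-contained sketch of mapping Tajo column types to a Parquet message schema. It is not the code attached to this issue: the class {{SchemaConverterSketch}}, its methods, and the particular type mapping are hypothetical and chosen only for illustration. It also does not use the Parquet library; it only prints the schema in Parquet's textual form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; not the TajoSchemaConverter described in this issue.
class SchemaConverterSketch {
    // A small subset of Tajo column types.
    enum TajoType { BOOLEAN, INT4, INT8, FLOAT4, FLOAT8, TEXT }

    // Assumed mapping from a Tajo type to a Parquet primitive type name.
    static String toParquetType(TajoType type) {
        switch (type) {
            case BOOLEAN: return "boolean";
            case INT4:    return "int32";
            case INT8:    return "int64";
            case FLOAT4:  return "float";
            case FLOAT8:  return "double";
            case TEXT:    return "binary";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    // Renders the columns, in insertion order, as a Parquet message schema string.
    static String toParquetSchema(String tableName, Map<String, TajoType> columns) {
        StringBuilder sb = new StringBuilder("message " + tableName + " {\n");
        for (Map.Entry<String, TajoType> col : columns.entrySet()) {
            sb.append("  optional ")
              .append(toParquetType(col.getValue()))
              .append(' ')
              .append(col.getKey());
            // Text columns are stored as binary and annotated as UTF8 strings.
            if (col.getValue() == TajoType.TEXT) {
                sb.append(" (UTF8)");
            }
            sb.append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, TajoType> columns = new LinkedHashMap<>();
        columns.put("id", TajoType.INT8);
        columns.put("name", TajoType.TEXT);
        columns.put("score", TajoType.FLOAT8);
        System.out.print(toParquetSchema("example_table", columns));
    }
}
```

Every column is declared {{optional}} here on the assumption that any column may hold NULL; a real converter would also need the reverse direction (Parquet schema to Tajo schema) and the remaining Tajo types.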

  was:
Parquet is a very promising file format developed by Twitter. We need to investigate the applicability of Parquet. If possible, we will implement a Parquet port.

http://parquet.io/


> Parquet Integration
> -------------------
>
>                 Key: TAJO-30
>                 URL: https://issues.apache.org/jira/browse/TAJO-30
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>            Assignee: David Chen
>              Labels: Parquet
>
> Parquet is a columnar storage format developed by Twitter. Implement Parquet (http://parquet.io/) support for Tajo.
> The implementation consists of the following:
>  * {{TajoParquetReader}} and {{TajoParquetWriter}} - Top-level reader and writer for serializing/deserializing to Tajo Tuples.
>  * {{TajoReadSupport}} and {{TajoWriteSupport}} - Abstractions to perform conversion between Parquet and Tajo records.
>  * {{TajoRecordMaterializer}} - Materializes Tajo Tuples from Parquet's internal representation.
>  * {{TajoRecordConverter}} - Used by {{TajoRecordMaterializer}} to materialize a Tajo Tuple.
>  * {{TajoSchemaConverter}} - Converts between Tajo and Parquet schemas.


