Posted to dev@parquet.apache.org by "Sergio Peña (JIRA)" <ji...@apache.org> on 2015/06/22 18:44:00 UTC

[jira] [Created] (PARQUET-315) Add PARQUET_1_0 and non-repeated data performance tests to parquet-benchmarks

Sergio Peña created PARQUET-315:
-----------------------------------

             Summary: Add PARQUET_1_0 and non-repeated data performance tests to parquet-benchmarks
                 Key: PARQUET-315
                 URL: https://issues.apache.org/jira/browse/PARQUET-315
             Project: Parquet
          Issue Type: Test
          Components: parquet-mr
            Reporter: Sergio Peña
            Priority: Minor


The current parquet-benchmarks module runs performance tests across different block and page sizes for the PARQUET_2_0 version only. We should also run tests with the PARQUET_1_0 version, both to see what the new Parquet version improves and to catch possible overheads early by comparing against the old file format.

Also, this module benchmarks the settings with repeated data only. We should also use random data, so that the results show how the current and new encodings behave with data closer to real-world data.
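
To illustrate why the shape of the input matters, here is a minimal sketch that builds a repeated buffer and a random buffer of the same size and compares how well each one compresses. It is not taken from parquet-benchmarks: the class and method names (DataShapeSketch, repeatedData, randomData, deflatedSize) are hypothetical, and java.util.zip.Deflater is only a stand-in for the Parquet encodings and codecs, which may behave differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical illustration only; not part of the parquet-benchmarks module.
class DataShapeSketch {

    // Fills a buffer with one short pattern repeated over and over.
    static byte[] repeatedData(int size) {
        byte[] pattern = "parquet-benchmark-row".getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[size];
        for (int i = 0; i < size; i++) {
            out[i] = pattern[i % pattern.length];
        }
        return out;
    }

    // Fills a buffer with pseudo-random bytes; the fixed seed keeps runs reproducible.
    static byte[] randomData(int size, long seed) {
        byte[] out = new byte[size];
        new Random(seed).nextBytes(out);
        return out;
    }

    // Returns the number of bytes DEFLATE produces for the given input.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        int size = 1 << 20; // 1 MiB of input for each data shape
        System.out.println("repeated: " + deflatedSize(repeatedData(size)) + " bytes");
        System.out.println("random:   " + deflatedSize(randomData(size, 42L)) + " bytes");
    }
}
```

The repeated buffer should shrink to a small fraction of its size while the random one barely shrinks at all, which is why a benchmark that uses only repeated data can give an overly optimistic picture of the encodings.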



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)