Posted to issues@drill.apache.org by "Jason Altekruse (JIRA)" <ji...@apache.org> on 2015/01/09 00:29:34 UTC

[jira] [Commented] (DRILL-1965) Expand read and write testing for parquet across all supported types

    [ https://issues.apache.org/jira/browse/DRILL-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270207#comment-14270207 ] 

Jason Altekruse commented on DRILL-1965:
----------------------------------------

Two methods for doing this have been explored. It would be useful to have a human-editable format in which test writers can write input files and baselines themselves. JSON is currently the easiest format, so I explored writing a JSON file with numerics or strings that could be cast into all supported types. The attached patch is the initial effort on this work. The interval types do not appear to be casting correctly as they are currently specified; I think it's just a formatting problem, but I am going to hand this off to Ramana for further work and for generating the parquet files.
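
To illustrate the JSON-plus-cast idea described above, here is a minimal sketch in Python. It is not taken from the attached patch: the column names, the list of target types, the file path, and the helper names write_json_input and build_cast_query are all hypothetical, and the generated query is only assumed to be in a form Drill would accept. The interval types are left out, since how to specify them is still the open question.

```python
import json

# Hypothetical column -> target type mapping. These names and types are
# assumptions for illustration, not the list used in the attached patch.
CAST_TARGETS = {
    "int_col": "INT",
    "bigint_col": "BIGINT",
    "float8_col": "DOUBLE",
    "varchar_col": "VARCHAR(20)",
    "date_col": "DATE",
    "timestamp_col": "TIMESTAMP",
}

# Human-editable input rows: every value is a plain JSON number or string.
ROWS = [
    {"int_col": 1, "bigint_col": 10000000000, "float8_col": 1.5,
     "varchar_col": "abc", "date_col": "2015-01-08",
     "timestamp_col": "2015-01-08 23:29:34"},
    {"int_col": -1, "bigint_col": -10000000000, "float8_col": -0.25,
     "varchar_col": "", "date_col": "1970-01-01",
     "timestamp_col": "1970-01-01 00:00:00"},
]


def write_json_input(path, rows):
    """Write one JSON object per line so the file stays easy to edit by hand."""
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def build_cast_query(source, targets):
    """Build a SELECT that casts each JSON column to its target type."""
    casts = ",\n  ".join(
        f"CAST({col} AS {typ}) AS {col}" for col, typ in targets.items()
    )
    return f"SELECT\n  {casts}\nFROM {source}"


if __name__ == "__main__":
    # The path and the dfs workspace below are placeholders.
    write_json_input("all_types.json", ROWS)
    print(build_cast_query("dfs.`/tmp/all_types.json`", CAST_TARGETS))
```

The same query could presumably be wrapped in a CREATE TABLE ... AS statement to produce the parquet files, with the JSON file doubling as the human-readable baseline.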

The patch also includes some code to generate a physical plan that uses the mock-scan operator, a workaround I was trying before I realized the unsigned types were not fully implemented (there are references to them in the code, but they cannot be cast to and are not currently supported). This did reveal some shortcomings in the generateTestData method of several of the value vector types, such as date and timestamp.

> Expand read and write testing for parquet across all supported types
> --------------------------------------------------------------------
>
>                 Key: DRILL-1965
>                 URL: https://issues.apache.org/jira/browse/DRILL-1965
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>
> The additional types we added to the parquet spec to allow use of parquet as a general purpose export format for Drill query results have not all been thoroughly tested. We should make a better set of tests to ensure that the read and write paths for these types are all working properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)