Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2014/07/25 19:21:38 UTC

[jira] [Commented] (DRILL-1161) Drill Parquet writer fails with an Out of Memory issue when the data is large enough

    [ https://issues.apache.org/jira/browse/DRILL-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074614#comment-14074614 ] 

Jacques Nadeau commented on DRILL-1161:
---------------------------------------

The issue here is that the Parquet writer tries to maintain the entire row group on heap. We need to update the Parquet writer to buffer row groups off heap instead.
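
For illustration only, here is a minimal sketch of the off-heap approach in plain Java. This is not Drill's actual writer code; the OffHeapColumnBuffer class and its methods are hypothetical. The idea is that values for the in-flight row group accumulate in a direct ByteBuffer, which lives outside the JVM heap, so a large row group does not compete with heap allocations:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: accumulate column values for a row group in native
// (direct) memory instead of on-heap byte arrays.
public class OffHeapColumnBuffer {
    private ByteBuffer buf;

    public OffHeapColumnBuffer(int initialCapacity) {
        // allocateDirect places the buffer outside the Java heap
        buf = ByteBuffer.allocateDirect(initialCapacity);
    }

    public void appendUtf8(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ensureRemaining(4 + bytes.length);
        buf.putInt(bytes.length); // length-prefix each value
        buf.put(bytes);
    }

    public int bytesBuffered() {
        return buf.position();
    }

    private void ensureRemaining(int needed) {
        if (buf.remaining() < needed) {
            // Grow by doubling (at least enough for this value) and copy
            // the old contents into a new direct buffer.
            ByteBuffer bigger = ByteBuffer.allocateDirect(
                Math.max(buf.capacity() * 2, buf.capacity() + needed));
            buf.flip();
            bigger.put(buf);
            buf = bigger;
        }
    }

    public static void main(String[] args) {
        // Simulate the wide-column case from this report: very long strings.
        char[] chars = new char[100_000];
        Arrays.fill(chars, 'a');
        String wide = new String(chars);

        OffHeapColumnBuffer col = new OffHeapColumnBuffer(1 << 20);
        for (int i = 0; i < 1_000; i++) {
            col.appendUtf8(wide);
        }
        System.out.println("bytes buffered off heap: " + col.bytesBuffered());
    }
}

The same grow-and-copy pattern would apply to any per-column buffer; Drill would presumably route such allocations through its own allocator rather than raw ByteBuffers.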

> Drill Parquet writer fails with an Out of Memory issue when the data is large enough
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-1161
>                 URL: https://issues.apache.org/jira/browse/DRILL-1161
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet, Storage - Writer
>            Reporter: Rahul Challapalli
>         Attachments: error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The query below fails with an out-of-memory error:
> create table `wide-columns-100000` as select columns[0] col0, cast(columns[1] as int) col1 from `wide-columns-100000.tbl`;
> The source file contains 100,000 records, each with 2 columns: the first is a string of 100,000 characters and the second is an integer. Adding a limit to the above query succeeds. I have attached the error messages from drillbit.log and drillbit.out.
> Let me know if you need anything more.
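
For anyone trying to reproduce this, here is a minimal sketch of a generator for the input file described above. It is hypothetical: it assumes the .tbl file is pipe-delimited (which is how Drill's text reader typically exposes fields through the columns[] array) and note that it writes roughly 10 GB of data:

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical generator for the input described in this report:
// 100,000 records, each a 100,000-character string plus an integer,
// pipe-delimited. Output is roughly 10 GB.
public class GenerateWideColumns {
    public static void main(String[] args) throws IOException {
        char[] chars = new char[100_000];
        Arrays.fill(chars, 'a');
        String wide = new String(chars);

        try (BufferedWriter out = Files.newBufferedWriter(
                Paths.get("wide-columns-100000.tbl"))) {
            for (int i = 0; i < 100_000; i++) {
                out.write(wide);                // columns[0]: the wide string
                out.write('|');
                out.write(Integer.toString(i)); // columns[1]: an integer
                out.newLine();
            }
        }
    }
}

Compile and run with "javac GenerateWideColumns.java && java GenerateWideColumns", then point the query's workspace at the directory containing the generated file.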



--
This message was sent by Atlassian JIRA
(v6.2#6252)