Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2014/07/25 19:21:38 UTC
[jira] [Commented] (DRILL-1161) Drill Parquet writer fails with an Out of Memory issue when the data is large enough
[ https://issues.apache.org/jira/browse/DRILL-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074614#comment-14074614 ]
Jacques Nadeau commented on DRILL-1161:
---------------------------------------
The issue here is that the Parquet writer tries to hold the entire row group on the JVM heap. We need to update the Parquet writer to buffer row groups off heap instead.
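The heap-vs-off-heap distinction the comment draws can be illustrated with plain `java.nio.ByteBuffer`. This is only a sketch of the general technique, not Drill's actual writer code: a heap buffer counts against `-Xmx`, while a direct buffer lives outside the heap (bounded by `-XX:MaxDirectMemorySize`), so buffering a large row group in direct memory avoids heap OOM errors.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only -- not Drill's ParquetRecordWriter.
// Contrasts buffering a row group on the JVM heap with buffering it
// in direct (off-heap) memory, which is what the comment proposes.
public class RowGroupBufferSketch {
    // Tiny size for the demo; a real Parquet row group is typically
    // hundreds of megabytes.
    static final int ROW_GROUP_BYTES = 4 * 1024;

    // Heap buffering: the bytes live in the Java heap and count
    // against -Xmx, so large row groups can trigger an OOM.
    static ByteBuffer heapBuffer() {
        return ByteBuffer.allocate(ROW_GROUP_BYTES);
    }

    // Off-heap buffering: a direct ByteBuffer is allocated outside
    // the heap, bounded by -XX:MaxDirectMemorySize instead of -Xmx.
    static ByteBuffer offHeapBuffer() {
        return ByteBuffer.allocateDirect(ROW_GROUP_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(heapBuffer().isDirect());    // false
        System.out.println(offHeapBuffer().isDirect()); // true
    }
}
```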
> Drill Parquet writer fails with an Out of Memory issue when the data is large enough
> ------------------------------------------------------------------------------------
>
> Key: DRILL-1161
> URL: https://issues.apache.org/jira/browse/DRILL-1161
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet, Storage - Writer
> Reporter: Rahul Challapalli
> Attachments: error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The query below fails with an out-of-memory error:
> create table `wide-columns-100000` as select columns[0] col0, cast(columns[1] as int) col1 from `wide-columns-100000.tbl`;
> The source file contains 100000 records, each with two columns: the first is a string of 100000 characters, and the second is an integer. Adding a limit to the above query succeeds. I have attached the error messages from drillbit.log and drillbit.out.
> Let me know if you need anything more.
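A back-of-envelope estimate (my own arithmetic, not from the report) shows why this reproduction plausibly exhausts a default-sized heap: 100,000 rows times 100,000 characters is about 10 GB of raw string data before any encoding overhead, and a Parquet writer that buffers a whole row group on heap must hold a large slice of that at once.

```java
// Rough size estimate for the reported dataset. Assumes 1 byte per
// character (ASCII); Java String objects would use more.
public class SizeEstimate {
    public static void main(String[] args) {
        long rows = 100_000L;
        long charsPerRow = 100_000L;
        long rawBytes = rows * charsPerRow; // 10,000,000,000 bytes
        System.out.println(rawBytes / (1024L * 1024 * 1024) + " GiB"); // 9 GiB
    }
}
```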
--
This message was sent by Atlassian JIRA
(v6.2#6252)