Posted to issues@drill.apache.org by "Steven Phillips (JIRA)" <ji...@apache.org> on 2015/04/13 20:58:12 UTC

[jira] [Updated] (DRILL-2758) Memory leak in parquet writer when writing billions of records

     [ https://issues.apache.org/jira/browse/DRILL-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Phillips updated DRILL-2758:
-----------------------------------
    Fix Version/s: 0.9.0

> Memory leak in parquet writer when writing billions of records
> --------------------------------------------------------------
>
>                 Key: DRILL-2758
>                 URL: https://issues.apache.org/jira/browse/DRILL-2758
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.8.0
>            Reporter: Aman Sinha
>            Assignee: Steven Phillips
>             Fix For: 0.9.0
>
>
>  Encountered the following memory leak when running a CTAS that creates Parquet data.  The data set is large, so I cannot provide a reproduction here.  Several billion records are written, and all fragments except one show the FINISHED state in the profile.  The one fragment still in the RUNNING state shows 0 records written.  A jstack on that node showed no activity and the CPU was idle.  However, drillbit.log shows the following memory leak:
> {code}
> 2015-04-10 18:11:40,337 [2ad7efdb-37ad-743e-3fe4-54a9d158d843:frag:1:40] WARN  o.a.d.e.w.fragment.FragmentExecutor - Failure while closing out resources
> java.lang.IllegalStateException: Attempted to close accountor with 1 buffer(s) still allocatedfor QueryId: 2ad7efdb-37ad-743e-3fe4-54a9d158d843, MajorFragmentId: 1, MinorFragmentId: 40.
>         Total 1 allocation(s) of byte size(s): 1153433, at stack location:
>                 org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:234)
>                 org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator.allocate(ParquetDirectByteBufferAllocator.java:45)
>                 parquet.bytes.CapacityByteArrayOutputStream.allocateSlab(CapacityByteArrayOutputStream.java:74)
>                 parquet.bytes.CapacityByteArrayOutputStream.initSlabs(CapacityByteArrayOutputStream.java:88)
>                 parquet.bytes.CapacityByteArrayOutputStream.<init>(CapacityByteArrayOutputStream.java:69)
>                 parquet.column.values.plain.PlainValuesWriter.<init>(PlainValuesWriter.java:48)
>                 parquet.column.ParquetProperties.getValuesWriter(ParquetProperties.java:109)
>                 parquet.column.impl.ColumnWriterImpl.<init>(ColumnWriterImpl.java:81)
>                 parquet.column.impl.ColumnWriteStoreImpl.newMemColumn(ColumnWriteStoreImpl.java:68)
>                 parquet.column.impl.ColumnWriteStoreImpl.getColumnWriter(ColumnWriteStoreImpl.java:56)
>                 parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.<init>(MessageColumnIO.java:124)
>                 parquet.io.MessageColumnIO.getRecordWriter(MessageColumnIO.java:315)
>                 org.apache.drill.exec.store.parquet.ParquetRecordWriter.newSchema(ParquetRecordWriter.java:165)
>                 org.apache.drill.exec.store.parquet.ParquetRecordWriter.updateSchema(ParquetRecordWriter.java:141)
>                 org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema(WriterRecordBatch.java:162)
>                 org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:113)
>                 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>                 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>                 org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68)
>                 org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:99)
> {code}
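
For readers unfamiliar with this kind of warning, the check behind the log above can be sketched roughly as follows. This is a minimal illustration, not Drill's or Parquet's actual code: TrackingAllocator, SketchWriter, and all of their methods are hypothetical names invented here, and the sketch makes no claim about the root cause of this issue, which the report does not identify. It only shows the general pattern of an allocator that counts outstanding buffers and refuses to close while any remain, and a writer that must release its slab before that close happens.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LeakCheckSketch {

    /** Hypothetical allocator that remembers every buffer it has handed out. */
    public static final class TrackingAllocator implements AutoCloseable {
        // buffer id -> size in bytes
        private final Map<Integer, Integer> outstanding = new LinkedHashMap<>();
        private int nextId = 0;

        public int allocate(int sizeBytes) {
            int id = nextId++;
            outstanding.put(id, sizeBytes);
            return id;
        }

        public void release(int id) {
            outstanding.remove(id);
        }

        public int outstandingCount() {
            return outstanding.size();
        }

        /** Refuses to close while any buffer is still outstanding. */
        @Override
        public void close() {
            if (!outstanding.isEmpty()) {
                throw new IllegalStateException(
                    "Attempted to close with " + outstanding.size()
                        + " buffer(s) still allocated, sizes: " + outstanding.values());
            }
        }
    }

    /** Hypothetical writer that holds at most one slab at a time. */
    public static final class SketchWriter {
        private final TrackingAllocator allocator;
        private Integer slabId; // null when no slab is held

        public SketchWriter(TrackingAllocator allocator) {
            this.allocator = allocator;
        }

        /** Releases the previous slab (if any) before allocating a new one. */
        public void newSchema(int slabBytes) {
            releaseSlab();
            slabId = allocator.allocate(slabBytes);
        }

        /** Must run before the allocator is closed, or the slab is reported as leaked. */
        public void cleanup() {
            releaseSlab();
        }

        private void releaseSlab() {
            if (slabId != null) {
                allocator.release(slabId);
                slabId = null;
            }
        }
    }

    public static void main(String[] args) {
        TrackingAllocator allocator = new TrackingAllocator();
        SketchWriter writer = new SketchWriter(allocator);
        writer.newSchema(1153433); // same byte size as in the log, for illustration only

        try {
            allocator.close(); // writer.cleanup() has not run yet
        } catch (IllegalStateException e) {
            System.out.println("Leak detected: " + e.getMessage());
        }

        writer.cleanup();  // release the slab
        allocator.close(); // now succeeds
        System.out.println("Closed cleanly after cleanup()");
    }
}
```

In this sketch, the point to notice is that the close-time check only reports that a buffer is still outstanding; finding which code path skipped the release is what the allocation stack location in the log is for.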



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)