Posted to issues@drill.apache.org by "Sean Hsuan-Yi Chu (JIRA)" <ji...@apache.org> on 2015/03/16 18:35:39 UTC

[jira] [Commented] (DRILL-1911) Querying same field multiple times with different case would hit memory leak and return incorrect result.

    [ https://issues.apache.org/jira/browse/DRILL-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363557#comment-14363557 ] 

Sean Hsuan-Yi Chu commented on DRILL-1911:
------------------------------------------

The issues DRILL-2311, DRILL-1911, and DRILL-1943 are all related.

The solution is:

In ProjectRecordBatch, even if a column from the incoming record batch does not need to be classified, the output name for that column still needs to be ensured to be unique.
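
For illustration only, here is a minimal sketch of that idea (the class and method names below are hypothetical, not the actual ProjectRecordBatch code): track the output names already handed out in a case-insensitive way, and rename any collision before the column is added to the output container.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch only -- not the actual ProjectRecordBatch change.
    // Output names are tracked case-insensitively; a numeric suffix is appended
    // on collision so that e.g. employee_id and Employee_id become two distinct
    // output columns instead of silently mapping to the same one.
    public class OutputNameDeduplicator {
      private final Map<String, Integer> seenNames = new HashMap<>();

      public String ensureUnique(String requestedName) {
        String key = requestedName.toLowerCase();
        Integer nextSuffix = seenNames.get(key);
        if (nextSuffix == null) {
          seenNames.put(key, 0);
          return requestedName;                 // first occurrence keeps its name
        }
        seenNames.put(key, nextSuffix + 1);
        return requestedName + nextSuffix;      // e.g. Employee_id -> Employee_id0
      }

      public static void main(String[] args) {
        OutputNameDeduplicator dedup = new OutputNameDeduplicator();
        System.out.println(dedup.ensureUnique("employee_id"));  // employee_id
        System.out.println(dedup.ensureUnique("Employee_id"));  // Employee_id0
      }
    }

With a check like this in place, the second column in the example query below would come out under a distinct name (e.g. Employee_id0) instead of colliding with the first one, which is what leads to the missing column and the leaked buffers reported in this issue.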

> Querying same field multiple times with different case would hit memory leak and return incorrect result. 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-1911
>                 URL: https://issues.apache.org/jira/browse/DRILL-1911
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>             Fix For: 0.8.0
>
>
> git.commit.id.abbrev=309e1be
> If the same field is queried twice with different case, Drill throws a memory assertion error.
>  select employee_id, Employee_id from cp.`employee.json` limit 2;
> +-------------+
> | employee_id |
> +-------------+
> | 1           |
> | 2           |
> Query failed: Query failed: Failure while running fragment., Attempted to close accountor with 2 buffer(s) still allocatedfor QueryId: 2b5cc8eb-2817-aadb-e0fa-49272796592a, MajorFragmentId: 0, MinorFragmentId: 0.
>      Total 1 allocation(s) of byte size(s): 4096, at stack location:
>           org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:212)
>           org.apache.drill.exec.vector.UInt1Vector.allocateNewSafe(UInt1Vector.java:137)
>           org.apache.drill.exec.vector.NullableBigIntVector.allocateNewSafe(NullableBigIntVector.java:173)
>           org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doAlloc(ProjectRecordBatch.java:229)
>           org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:167)
>           org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
>           org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
>           org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>           org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>           org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67)
>           org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:97)
>           org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57)
>           org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114)
>           org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
>           java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>           java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>           java.lang.Thread.run(Thread.java:744)
> Also, notice that the query result contains only one field; the second field is missing.
> The plan looks fine.
> Drill Physical : 
> 00-00    Screen: rowcount = 463.0, cumulative cost = {1900.3 rows, 996.3 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 103
> 00-01      Project(employee_id=[$0], Employee_id=[$1]): rowcount = 463.0, cumulative cost = {1854.0 rows, 950.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 102
> 00-02        SelectionVectorRemover: rowcount = 463.0, cumulative cost = {1391.0 rows, 942.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 101
> 00-03          Limit(fetch=[2]): rowcount = 463.0, cumulative cost = {928.0 rows, 479.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 100
> 00-04            Project(employee_id=[$0], Employee_id=[$0]): rowcount = 463.0, cumulative cost = {926.0 rows, 471.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 99
> 00-05              Scan(groupscan=[EasyGroupScan [selectionRoot=/employee.json, numFiles=1, columns=[`employee_id`], files=[/employee.json]]]): rowcount = 463.0, cumulative cost = {463.0 rows, 463.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 98



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)