Posted to dev@drill.apache.org by Nitin Pawar <ni...@gmail.com> on 2017/04/03 05:59:51 UTC
Re: Help for DRILL-3609
Hi Aman,
I have committed my code at https://github.com/nitinpawar/drill
When I say that the results are crossing the partition boundary, I mean that
with partition by department_id, the partitionProcessor for department_id = 1
considers records from department_id = 2.
Here is the result of a sample query. Only the last record of each partition
has a null output, but the two records before the last one should also be
null, since I have set the offset to 3 in the lead function. I am trying to
find the location in the code that keeps copyNext from copying records from
the next partition.
0: jdbc:drill:zk=local> select department_id, salary, lead(salary,3) over
(partition by department_id order by salary asc) from cp.`employee.json`
limit 20;
+----------------+----------+----------+
| department_id  | salary   | EXPR$2   |
+----------------+----------+----------+
| 1              | 30000.0  | 35000.0  |
| 1              | 35000.0  | 40000.0  |
| 1              | 35000.0  | 40000.0  |
| 1              | 35000.0  | 80000.0  |
| 1              | 40000.0  | 6700.0   |
| 1              | 40000.0  | 8000.0   |
| 1              | 80000.0  | null     |
| 2              | 6700.0   | 10000.0  |
| 2              | 8000.0   | 25000.0  |
| 2              | 10000.0  | 5000.0   |
| 2              | 10000.0  | 8500.0   |
| 2              | 25000.0  | null     |
| 3              | 5000.0   | 45000.0  |
| 3              | 8500.0   | 5000.0   |
| 3              | 15000.0  | 6700.0   |
| 3              | 45000.0  | null     |
| 4              | 5000.0   | 5000.0   |
| 4              | 6700.0   | null     |
| 5              | 5000.0   | 5000.0   |
| 5              | 5000.0   | 6500.0   |
+----------------+----------+----------+
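The expected behaviour of lead(salary, 3) within a partition can be sketched outside Drill as follows. This is only an illustration of the semantics, not Drill's window operator: the class LeadSketch and the method lead are hypothetical names, and the sketch assumes the rows are already sorted by partition key and then by the ORDER BY column.

```java
import java.util.ArrayList;
import java.util.List;

public class LeadSketch {
    /**
     * Hypothetical helper (not Drill code): computes lead(value, offset)
     * for rows already sorted by partition key, then by the ORDER BY column.
     */
    public static List<Double> lead(int[] partitionKeys, double[] values, int offset) {
        List<Double> out = new ArrayList<>();
        int n = values.length;
        int start = 0;
        while (start < n) {
            // Find the exclusive end of the current partition.
            int end = start;
            while (end < n && partitionKeys[end] == partitionKeys[start]) {
                end++;
            }
            for (int row = start; row < end; row++) {
                int target = row + offset;
                if (target < end) {
                    // The target row lies inside the same partition.
                    out.add(Double.valueOf(values[target]));
                } else {
                    // The target row would cross the partition boundary.
                    out.add(null);
                }
            }
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // First two departments from the sample query output above.
        int[] dept = {1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2};
        double[] salary = {30000, 35000, 35000, 35000, 40000, 40000, 80000,
                           6700, 8000, 10000, 10000, 25000};
        System.out.println(lead(dept, salary, 3));
    }
}
```

With this boundary check, the last three rows of department_id = 1 come out as null instead of picking up 6700.0 and 8000.0 from department_id = 2.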
On Sat, Apr 1, 2017 at 5:07 AM, Aman Sinha <as...@mapr.com> wrote:
> Hi Nitin,
> When you say ‘it is crossing the partition boundary’, it’s not clear what
> precisely you are referring to. Window function operator semantics
> are somewhat complex, so please clarify.
> Usually it is more effective to put your investigation and even a link to
> your github branch (whatever progress you have made) in the JIRA itself.
> Please include the query that you are trying to run. This will give more
> context to someone to provide an answer to your question.
>
> -Aman
>
> On 3/30/17, 11:59 PM, "Nitin Pawar" <ni...@gmail.com> wrote:
>
> anyone who can spare 10-15 minutes ?
>
> Thanks,
> Nitin
>
> On Mon, Mar 27, 2017 at 3:56 PM, Nitin Pawar <ni...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I am working on DRILL-3609
> > <https://issues.apache.org/jira/browse/DRILL-3609>
> >
> > Right now I have been able to change the hard-coded offset to the value
> > the user inputs.
> > I have successfully run the query.
> >
> > I am currently stuck where it is crossing the partition boundary. In the
> > current implementation it is copying values across boundaries instead of
> > returning null.
> >
> > Can any dev spare 10-15 minutes to help me identify where I have to make
> > the changes?
> >
> > Thanks,
> > Nitin Pawar
> >
>
>
>
> --
> Nitin Pawar
>
>
>
--
Nitin Pawar
Re: Help for DRILL-3609
Posted by Nitin Pawar <ni...@gmail.com>.
Hi devs,
Can someone help me with this?
Thanks,
Nitin
--
Nitin Pawar