You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2018/03/06 14:46:17 UTC
Way to "pivot"
I am not sure if this is the right thing for what I am trying to do, but I
have data in this formate
source dt value
X 2018-03-06 11:00 0.31
X 2018-03-06 12:00 0.94
X 2018-03-06 13:00 0.89
X 2018-03-06 14:00 0.01
X 2018-03-06 15:00 0.43
Y 2018-03-06 11:00 1.43
Y 2018-03-06 12:00 0.50
Y 2018-03-06 13:00 0.10
Y 2018-03-06 14:00 0.42
Y 2018-03-06 15:00 0.41
Z 2018-03-06 11:00 5.34
Z 2018-03-06 12:00 4.32
Z 2018-03-06 13:00 4.20
Z 2018-03-06 14:00 0.89
Z 2018-03-06 15:00 0.01
I'd like to graph it as three lines (X, Y and Z) over time, so the graph
tool I am using asks for it this format:
dt X Y Z
2018-03-06 11:00 0.31 1.43 5.34
2018-03-06 12:00 0.94 0.50 4.32
2018-03-06 13:00 0.89 0.10 4.20
2018-03-06 14:00 0.01 0.42 0.89
2018-03-06 15:00 0.43 0.41 0.01
So I think that would be a PIVOT like function right (which I don't think
Drill has) Is there a way to "fake" this in Drill using some other built in
functions?
Thanks!
John
Re: Way to "pivot"
Posted by John Omernik <jo...@omernik.com>.
Looks as though there is a JIRA, I added some of the notes here to the
comments:
https://issues.apache.org/jira/browse/DRILL-4223
On Tue, Mar 6, 2018 at 1:59 PM, Saurabh Mahapatra <
saurabhmahapatra94@gmail.com> wrote:
> Looks like SQL Server supports it, not sure if this is in the SQL standard:
>
> https://stackoverflow.com/questions/15931607/convert-
> rows-to-columns-using-pivot-in-sql-server
>
>
>
> On Tue, Mar 6, 2018 at 11:47 AM, Kunal Khatua <ku...@gmail.com>
> wrote:
>
> > Not until now :)
> >
> > Can you file a JIRA so that we can track it?
> >
> > On Tue, Mar 6, 2018 at 11:40 AM, John Omernik <jo...@omernik.com> wrote:
> >
> > > Perfect. That works for me because I have a limited number of values,
> I
> > > could see that getting out of hand if the values were unknown. Has
> there
> > > been any talk of a Pivot function in Drill? That would be helpful so
> you
> > > didn't have to know the column names ahead of time.
> > >
> > > John
> > >
> > > On Tue, Mar 6, 2018 at 10:50 AM, Ted Dunning <te...@gmail.com>
> > > wrote:
> > >
> > > > Arjun's approach works even if the timestamps are not unique.
> > Especially
> > > if
> > > > you use avg instead of max.
> > > >
> > > > On Mar 6, 2018 8:47 AM, "Arjun kr" <ar...@outlook.com> wrote:
> > > >
> > > > > If each timestamp has only one set of values for (x,y,z) , you can
> > try
> > > > > something like below.
> > > > >
> > > > > select dt ,
> > > > > max(case when source='X' THEN `value` else 0.0 end) as X,
> > > > > max(case when source='Y' THEN `value` else 0.0 end) as Y,
> > > > > max(case when source='Z' THEN `value` else 0.0 end) as Z
> > > > > from
> > > > > <table>
> > > > > group by dt;
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Arjun
> > > > >
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Andries Engelbrecht <ae...@mapr.com>
> > > > > Sent: Tuesday, March 6, 2018 9:11 PM
> > > > > To: user@drill.apache.org
> > > > > Subject: Re: Way to "pivot"
> > > > >
> > > > > If the X, Y and Z is unique for each timestamp you can perhaps use
> > > group
> > > > > by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be
> > worth
> > > > > looking into, but is going to be expensive to execute. Just an
> idea,
> > > but
> > > > > have not tested it.
> > > > >
> > > > > --Andries
> > > > >
> > > > >
> > > > > On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
> > > > >
> > > > > I am not sure if this is the right thing for what I am trying
> to
> > > do,
> > > > > but I
> > > > > have data in this formate
> > > > >
> > > > >
> > > > > source dt value
> > > > > X 2018-03-06 11:00 0.31
> > > > > X 2018-03-06 12:00 0.94
> > > > > X 2018-03-06 13:00 0.89
> > > > > X 2018-03-06 14:00 0.01
> > > > > X 2018-03-06 15:00 0.43
> > > > > Y 2018-03-06 11:00 1.43
> > > > > Y 2018-03-06 12:00 0.50
> > > > > Y 2018-03-06 13:00 0.10
> > > > > Y 2018-03-06 14:00 0.42
> > > > > Y 2018-03-06 15:00 0.41
> > > > > Z 2018-03-06 11:00 5.34
> > > > > Z 2018-03-06 12:00 4.32
> > > > > Z 2018-03-06 13:00 4.20
> > > > > Z 2018-03-06 14:00 0.89
> > > > > Z 2018-03-06 15:00 0.01
> > > > >
> > > > > I'd like to graph it as three lines (X, Y and Z) over time, so
> > the
> > > > > graph
> > > > > tool I am using asks for it this format:
> > > > >
> > > > >
> > > > >
> > > > > dt X Y
> > > > > Z
> > > > >
> > > > > 2018-03-06 11:00 0.31 1.43 5.34
> > > > > 2018-03-06 12:00 0.94 0.50 4.32
> > > > > 2018-03-06 13:00 0.89 0.10 4.20
> > > > > 2018-03-06 14:00 0.01 0.42 0.89
> > > > > 2018-03-06 15:00 0.43 0.41 0.01
> > > > >
> > > > >
> > > > > So I think that would be a PIVOT like function right (which I
> > don't
> > > > > think
> > > > > Drill has) Is there a way to "fake" this in Drill using some
> > other
> > > > > built in
> > > > > functions?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > John
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
Re: Way to "pivot"
Posted by Saurabh Mahapatra <sa...@gmail.com>.
Looks like SQL Server supports it, not sure if this is in the SQL standard:
https://stackoverflow.com/questions/15931607/convert-rows-to-columns-using-pivot-in-sql-server
On Tue, Mar 6, 2018 at 11:47 AM, Kunal Khatua <ku...@gmail.com> wrote:
> Not until now :)
>
> Can you file a JIRA so that we can track it?
>
> On Tue, Mar 6, 2018 at 11:40 AM, John Omernik <jo...@omernik.com> wrote:
>
> > Perfect. That works for me because I have a limited number of values, I
> > could see that getting out of hand if the values were unknown. Has there
> > been any talk of a Pivot function in Drill? That would be helpful so you
> > didn't have to know the column names ahead of time.
> >
> > John
> >
> > On Tue, Mar 6, 2018 at 10:50 AM, Ted Dunning <te...@gmail.com>
> > wrote:
> >
> > > Arjun's approach works even if the timestamps are not unique.
> Especially
> > if
> > > you use avg instead of max.
> > >
> > > On Mar 6, 2018 8:47 AM, "Arjun kr" <ar...@outlook.com> wrote:
> > >
> > > > If each timestamp has only one set of values for (x,y,z) , you can
> try
> > > > something like below.
> > > >
> > > > select dt ,
> > > > max(case when source='X' THEN `value` else 0.0 end) as X,
> > > > max(case when source='Y' THEN `value` else 0.0 end) as Y,
> > > > max(case when source='Z' THEN `value` else 0.0 end) as Z
> > > > from
> > > > <table>
> > > > group by dt;
> > > >
> > > > Thanks,
> > > >
> > > > Arjun
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Andries Engelbrecht <ae...@mapr.com>
> > > > Sent: Tuesday, March 6, 2018 9:11 PM
> > > > To: user@drill.apache.org
> > > > Subject: Re: Way to "pivot"
> > > >
> > > > If the X, Y and Z is unique for each timestamp you can perhaps use
> > group
> > > > by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be
> worth
> > > > looking into, but is going to be expensive to execute. Just an idea,
> > but
> > > > have not tested it.
> > > >
> > > > --Andries
> > > >
> > > >
> > > > On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
> > > >
> > > > I am not sure if this is the right thing for what I am trying to
> > do,
> > > > but I
> > > > have data in this formate
> > > >
> > > >
> > > > source dt value
> > > > X 2018-03-06 11:00 0.31
> > > > X 2018-03-06 12:00 0.94
> > > > X 2018-03-06 13:00 0.89
> > > > X 2018-03-06 14:00 0.01
> > > > X 2018-03-06 15:00 0.43
> > > > Y 2018-03-06 11:00 1.43
> > > > Y 2018-03-06 12:00 0.50
> > > > Y 2018-03-06 13:00 0.10
> > > > Y 2018-03-06 14:00 0.42
> > > > Y 2018-03-06 15:00 0.41
> > > > Z 2018-03-06 11:00 5.34
> > > > Z 2018-03-06 12:00 4.32
> > > > Z 2018-03-06 13:00 4.20
> > > > Z 2018-03-06 14:00 0.89
> > > > Z 2018-03-06 15:00 0.01
> > > >
> > > > I'd like to graph it as three lines (X, Y and Z) over time, so
> the
> > > > graph
> > > > tool I am using asks for it this format:
> > > >
> > > >
> > > >
> > > > dt X Y
> > > > Z
> > > >
> > > > 2018-03-06 11:00 0.31 1.43 5.34
> > > > 2018-03-06 12:00 0.94 0.50 4.32
> > > > 2018-03-06 13:00 0.89 0.10 4.20
> > > > 2018-03-06 14:00 0.01 0.42 0.89
> > > > 2018-03-06 15:00 0.43 0.41 0.01
> > > >
> > > >
> > > > So I think that would be a PIVOT like function right (which I
> don't
> > > > think
> > > > Drill has) Is there a way to "fake" this in Drill using some
> other
> > > > built in
> > > > functions?
> > > >
> > > > Thanks!
> > > >
> > > > John
> > > >
> > > >
> > > >
> > >
> >
>
Re: Way to "pivot"
Posted by Kunal Khatua <ku...@gmail.com>.
Not until now :)
Can you file a JIRA so that we can track it?
On Tue, Mar 6, 2018 at 11:40 AM, John Omernik <jo...@omernik.com> wrote:
> Perfect. That works for me because I have a limited number of values, I
> could see that getting out of hand if the values were unknown. Has there
> been any talk of a Pivot function in Drill? That would be helpful so you
> didn't have to know the column names ahead of time.
>
> John
>
> On Tue, Mar 6, 2018 at 10:50 AM, Ted Dunning <te...@gmail.com>
> wrote:
>
> > Arjun's approach works even if the timestamps are not unique. Especially
> if
> > you use avg instead of max.
> >
> > On Mar 6, 2018 8:47 AM, "Arjun kr" <ar...@outlook.com> wrote:
> >
> > > If each timestamp has only one set of values for (x,y,z) , you can try
> > > something like below.
> > >
> > > select dt ,
> > > max(case when source='X' THEN `value` else 0.0 end) as X,
> > > max(case when source='Y' THEN `value` else 0.0 end) as Y,
> > > max(case when source='Z' THEN `value` else 0.0 end) as Z
> > > from
> > > <table>
> > > group by dt;
> > >
> > > Thanks,
> > >
> > > Arjun
> > >
> > >
> > >
> > > ________________________________
> > > From: Andries Engelbrecht <ae...@mapr.com>
> > > Sent: Tuesday, March 6, 2018 9:11 PM
> > > To: user@drill.apache.org
> > > Subject: Re: Way to "pivot"
> > >
> > > If the X, Y and Z is unique for each timestamp you can perhaps use
> group
> > > by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth
> > > looking into, but is going to be expensive to execute. Just an idea,
> but
> > > have not tested it.
> > >
> > > --Andries
> > >
> > >
> > > On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
> > >
> > > I am not sure if this is the right thing for what I am trying to
> do,
> > > but I
> > > have data in this formate
> > >
> > >
> > > source dt value
> > > X 2018-03-06 11:00 0.31
> > > X 2018-03-06 12:00 0.94
> > > X 2018-03-06 13:00 0.89
> > > X 2018-03-06 14:00 0.01
> > > X 2018-03-06 15:00 0.43
> > > Y 2018-03-06 11:00 1.43
> > > Y 2018-03-06 12:00 0.50
> > > Y 2018-03-06 13:00 0.10
> > > Y 2018-03-06 14:00 0.42
> > > Y 2018-03-06 15:00 0.41
> > > Z 2018-03-06 11:00 5.34
> > > Z 2018-03-06 12:00 4.32
> > > Z 2018-03-06 13:00 4.20
> > > Z 2018-03-06 14:00 0.89
> > > Z 2018-03-06 15:00 0.01
> > >
> > > I'd like to graph it as three lines (X, Y and Z) over time, so the
> > > graph
> > > tool I am using asks for it this format:
> > >
> > >
> > >
> > > dt X Y
> > > Z
> > >
> > > 2018-03-06 11:00 0.31 1.43 5.34
> > > 2018-03-06 12:00 0.94 0.50 4.32
> > > 2018-03-06 13:00 0.89 0.10 4.20
> > > 2018-03-06 14:00 0.01 0.42 0.89
> > > 2018-03-06 15:00 0.43 0.41 0.01
> > >
> > >
> > > So I think that would be a PIVOT like function right (which I don't
> > > think
> > > Drill has) Is there a way to "fake" this in Drill using some other
> > > built in
> > > functions?
> > >
> > > Thanks!
> > >
> > > John
> > >
> > >
> > >
> >
>
Re: Way to "pivot"
Posted by John Omernik <jo...@omernik.com>.
Perfect. That works for me because I have a limited number of values, I
could see that getting out of hand if the values were unknown. Has there
been any talk of a Pivot function in Drill? That would be helpful so you
didn't have to know the column names ahead of time.
John
On Tue, Mar 6, 2018 at 10:50 AM, Ted Dunning <te...@gmail.com> wrote:
> Arjun's approach works even if the timestamps are not unique. Especially if
> you use avg instead of max.
>
> On Mar 6, 2018 8:47 AM, "Arjun kr" <ar...@outlook.com> wrote:
>
> > If each timestamp has only one set of values for (x,y,z) , you can try
> > something like below.
> >
> > select dt ,
> > max(case when source='X' THEN `value` else 0.0 end) as X,
> > max(case when source='Y' THEN `value` else 0.0 end) as Y,
> > max(case when source='Z' THEN `value` else 0.0 end) as Z
> > from
> > <table>
> > group by dt;
> >
> > Thanks,
> >
> > Arjun
> >
> >
> >
> > ________________________________
> > From: Andries Engelbrecht <ae...@mapr.com>
> > Sent: Tuesday, March 6, 2018 9:11 PM
> > To: user@drill.apache.org
> > Subject: Re: Way to "pivot"
> >
> > If the X, Y and Z is unique for each timestamp you can perhaps use group
> > by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth
> > looking into, but is going to be expensive to execute. Just an idea, but
> > have not tested it.
> >
> > --Andries
> >
> >
> > On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
> >
> > I am not sure if this is the right thing for what I am trying to do,
> > but I
> > have data in this formate
> >
> >
> > source dt value
> > X 2018-03-06 11:00 0.31
> > X 2018-03-06 12:00 0.94
> > X 2018-03-06 13:00 0.89
> > X 2018-03-06 14:00 0.01
> > X 2018-03-06 15:00 0.43
> > Y 2018-03-06 11:00 1.43
> > Y 2018-03-06 12:00 0.50
> > Y 2018-03-06 13:00 0.10
> > Y 2018-03-06 14:00 0.42
> > Y 2018-03-06 15:00 0.41
> > Z 2018-03-06 11:00 5.34
> > Z 2018-03-06 12:00 4.32
> > Z 2018-03-06 13:00 4.20
> > Z 2018-03-06 14:00 0.89
> > Z 2018-03-06 15:00 0.01
> >
> > I'd like to graph it as three lines (X, Y and Z) over time, so the
> > graph
> > tool I am using asks for it this format:
> >
> >
> >
> > dt X Y
> > Z
> >
> > 2018-03-06 11:00 0.31 1.43 5.34
> > 2018-03-06 12:00 0.94 0.50 4.32
> > 2018-03-06 13:00 0.89 0.10 4.20
> > 2018-03-06 14:00 0.01 0.42 0.89
> > 2018-03-06 15:00 0.43 0.41 0.01
> >
> >
> > So I think that would be a PIVOT like function right (which I don't
> > think
> > Drill has) Is there a way to "fake" this in Drill using some other
> > built in
> > functions?
> >
> > Thanks!
> >
> > John
> >
> >
> >
>
Re: Way to "pivot"
Posted by Ted Dunning <te...@gmail.com>.
Arjun's approach works even if the timestamps are not unique. Especially if
you use avg instead of max.
On Mar 6, 2018 8:47 AM, "Arjun kr" <ar...@outlook.com> wrote:
> If each timestamp has only one set of values for (x,y,z) , you can try
> something like below.
>
> select dt ,
> max(case when source='X' THEN `value` else 0.0 end) as X,
> max(case when source='Y' THEN `value` else 0.0 end) as Y,
> max(case when source='Z' THEN `value` else 0.0 end) as Z
> from
> <table>
> group by dt;
>
> Thanks,
>
> Arjun
>
>
>
> ________________________________
> From: Andries Engelbrecht <ae...@mapr.com>
> Sent: Tuesday, March 6, 2018 9:11 PM
> To: user@drill.apache.org
> Subject: Re: Way to "pivot"
>
> If the X, Y and Z is unique for each timestamp you can perhaps use group
> by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth
> looking into, but is going to be expensive to execute. Just an idea, but
> have not tested it.
>
> --Andries
>
>
> On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
>
> I am not sure if this is the right thing for what I am trying to do,
> but I
> have data in this formate
>
>
> source dt value
> X 2018-03-06 11:00 0.31
> X 2018-03-06 12:00 0.94
> X 2018-03-06 13:00 0.89
> X 2018-03-06 14:00 0.01
> X 2018-03-06 15:00 0.43
> Y 2018-03-06 11:00 1.43
> Y 2018-03-06 12:00 0.50
> Y 2018-03-06 13:00 0.10
> Y 2018-03-06 14:00 0.42
> Y 2018-03-06 15:00 0.41
> Z 2018-03-06 11:00 5.34
> Z 2018-03-06 12:00 4.32
> Z 2018-03-06 13:00 4.20
> Z 2018-03-06 14:00 0.89
> Z 2018-03-06 15:00 0.01
>
> I'd like to graph it as three lines (X, Y and Z) over time, so the
> graph
> tool I am using asks for it this format:
>
>
>
> dt X Y
> Z
>
> 2018-03-06 11:00 0.31 1.43 5.34
> 2018-03-06 12:00 0.94 0.50 4.32
> 2018-03-06 13:00 0.89 0.10 4.20
> 2018-03-06 14:00 0.01 0.42 0.89
> 2018-03-06 15:00 0.43 0.41 0.01
>
>
> So I think that would be a PIVOT like function right (which I don't
> think
> Drill has) Is there a way to "fake" this in Drill using some other
> built in
> functions?
>
> Thanks!
>
> John
>
>
>
Re: Way to "pivot"
Posted by Arjun kr <ar...@outlook.com>.
If each timestamp has only one set of values for (x,y,z) , you can try something like below.
select dt ,
max(case when source='X' THEN `value` else 0.0 end) as X,
max(case when source='Y' THEN `value` else 0.0 end) as Y,
max(case when source='Z' THEN `value` else 0.0 end) as Z
from
<table>
group by dt;
Thanks,
Arjun
________________________________
From: Andries Engelbrecht <ae...@mapr.com>
Sent: Tuesday, March 6, 2018 9:11 PM
To: user@drill.apache.org
Subject: Re: Way to "pivot"
If the X, Y and Z is unique for each timestamp you can perhaps use group by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth looking into, but is going to be expensive to execute. Just an idea, but have not tested it.
--Andries
On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
I am not sure if this is the right thing for what I am trying to do, but I
have data in this formate
source dt value
X 2018-03-06 11:00 0.31
X 2018-03-06 12:00 0.94
X 2018-03-06 13:00 0.89
X 2018-03-06 14:00 0.01
X 2018-03-06 15:00 0.43
Y 2018-03-06 11:00 1.43
Y 2018-03-06 12:00 0.50
Y 2018-03-06 13:00 0.10
Y 2018-03-06 14:00 0.42
Y 2018-03-06 15:00 0.41
Z 2018-03-06 11:00 5.34
Z 2018-03-06 12:00 4.32
Z 2018-03-06 13:00 4.20
Z 2018-03-06 14:00 0.89
Z 2018-03-06 15:00 0.01
I'd like to graph it as three lines (X, Y and Z) over time, so the graph
tool I am using asks for it this format:
dt X Y Z
2018-03-06 11:00 0.31 1.43 5.34
2018-03-06 12:00 0.94 0.50 4.32
2018-03-06 13:00 0.89 0.10 4.20
2018-03-06 14:00 0.01 0.42 0.89
2018-03-06 15:00 0.43 0.41 0.01
So I think that would be a PIVOT like function right (which I don't think
Drill has) Is there a way to "fake" this in Drill using some other built in
functions?
Thanks!
John
Re: Way to "pivot"
Posted by Andries Engelbrecht <ae...@mapr.com>.
If the X, Y and Z is unique for each timestamp you can perhaps use group by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth looking into, but is going to be expensive to execute. Just an idea, but have not tested it.
--Andries
On 3/6/18, 6:46 AM, "John Omernik" <jo...@omernik.com> wrote:
I am not sure if this is the right thing for what I am trying to do, but I
have data in this formate
source dt value
X 2018-03-06 11:00 0.31
X 2018-03-06 12:00 0.94
X 2018-03-06 13:00 0.89
X 2018-03-06 14:00 0.01
X 2018-03-06 15:00 0.43
Y 2018-03-06 11:00 1.43
Y 2018-03-06 12:00 0.50
Y 2018-03-06 13:00 0.10
Y 2018-03-06 14:00 0.42
Y 2018-03-06 15:00 0.41
Z 2018-03-06 11:00 5.34
Z 2018-03-06 12:00 4.32
Z 2018-03-06 13:00 4.20
Z 2018-03-06 14:00 0.89
Z 2018-03-06 15:00 0.01
I'd like to graph it as three lines (X, Y and Z) over time, so the graph
tool I am using asks for it this format:
dt X Y Z
2018-03-06 11:00 0.31 1.43 5.34
2018-03-06 12:00 0.94 0.50 4.32
2018-03-06 13:00 0.89 0.10 4.20
2018-03-06 14:00 0.01 0.42 0.89
2018-03-06 15:00 0.43 0.41 0.01
So I think that would be a PIVOT like function right (which I don't think
Drill has) Is there a way to "fake" this in Drill using some other built in
functions?
Thanks!
John