Posted to dev@kylin.apache.org by Vadim Semenov <_...@databuryat.com> on 2015/10/27 05:03:10 UTC
KYLIN-747 bad query performance when IN clause contains a value
doesn't exist in the dictionary
Hi everyone,
I'm trying to find the commit where this issue was fixed (https://issues.apache.org/jira/browse/KYLIN-747).
Could you point me to it?
We have a cube that is partitioned by date, and when we query using SELECT * FROM table WHERE dt = '2015-10-01', I see in the logs:
"Can't translate value 2015-10-01 to dictionary ID, roundingFlag 0. Using default value \xFF",
which translates into a huge scan.
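To illustrate what the log message is describing, here is a minimal sketch of a dictionary-encoded dimension lookup. It is not Kylin's code: `ToyDictionary`, `id_of`, and the meaning given to `rounding_flag` (0 = exact match only, negative = round down, positive = round up) are assumptions made for illustration. It only shows why an exact lookup has no ID to return when the filter value was never loaded into the cube.

```python
from bisect import bisect_left


class ToyDictionary:
    """Hypothetical stand-in for a dimension dictionary: sorted values -> dense IDs."""

    def __init__(self, values):
        # IDs are simply positions in the sorted list of distinct values.
        self.values = sorted(set(values))

    def id_of(self, value, rounding_flag=0):
        i = bisect_left(self.values, value)
        if i < len(self.values) and self.values[i] == value:
            return i  # exact hit
        if rounding_flag == 0:
            # Exact lookup requested and the value is absent: nothing to return.
            raise KeyError(f"can't translate value {value!r} to a dictionary ID")
        if rounding_flag < 0:
            # Round down to the largest stored value below the input.
            if i == 0:
                raise KeyError(f"no stored value below {value!r}")
            return i - 1
        # Round up to the smallest stored value above the input.
        if i == len(self.values):
            raise KeyError(f"no stored value above {value!r}")
        return i


if __name__ == "__main__":
    d = ToyDictionary(["2015-10-02", "2015-10-03"])
    print(d.id_of("2015-10-02"))
    try:
        d.id_of("2015-10-01")  # not in the dictionary, exact lookup
    except KeyError as e:
        print("lookup failed:", e)
```

In this sketch the caller decides what to do when the exact lookup fails. Falling back to a placeholder key, as the log line suggests, keeps the query running but may widen the scan instead of narrowing it.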
Re: KYLIN-747 bad query performance when IN clause contains a
value doesn't exist in the dictionary
Posted by Vadim Semenov <_...@databuryat.com>.
Thank you!
I just tested the changes with queries like:
SELECT dt, SUM(metric) FROM table WHERE dt IN ('2015-10-01', '2015-10-02') GROUP BY dt;
SELECT dt, SUM(metric) FROM table WHERE dt BETWEEN '2015-10-01' AND '2015-10-07' GROUP BY dt;
Before the changes, every partition was scanned; after them, only the relevant partitions are scanned.
Very useful performance change, thanks again.
Re: KYLIN-747 bad query performance when IN clause contains a value
doesn't exist in the dictionary
Posted by Li Yang <li...@apache.org>.
Hm, I searched the commits for KYLIN-747 and found nothing either. I thought
it was covered by the fix for some other JIRA.
Anyway, I wrote some test cases that query non-existing values, and made
some further optimizations. Now an always-false scan range can be pruned.
https://github.com/apache/incubator-kylin/commit/f96600e89a5a2c0e533cea86ad6a73b9451bcddc
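The idea of pruning an always-false scan range can be sketched as follows. This is an illustration only, not the code in the commit above: `in_ranges`, `between_range`, and the representation of a dictionary as a sorted Python list are hypothetical. A filter value that is missing from the dictionary cannot match any row, so it contributes no range, and a filter that ends up with no ranges needs no scan at all.

```python
from bisect import bisect_left, bisect_right


def in_ranges(sorted_values, wanted):
    """ID ranges for `col IN (wanted)`; values absent from the dictionary match nothing."""
    ranges = []
    for v in sorted(set(wanted)):
        i = bisect_left(sorted_values, v)
        if i < len(sorted_values) and sorted_values[i] == v:
            ranges.append((i, i))  # single-ID range for an existing value
    return ranges  # empty list: the condition is always false, skip the scan


def between_range(sorted_values, low, high):
    """ID range for `col BETWEEN low AND high`, or None if no stored value falls inside."""
    lo = bisect_left(sorted_values, low)        # first ID whose value >= low
    hi = bisect_right(sorted_values, high) - 1  # last ID whose value <= high
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    dates = ["2015-10-02", "2015-10-03", "2015-10-05"]  # sorted dictionary values
    print(in_ranges(dates, ["2015-10-01"]))               # nothing to scan
    print(in_ranges(dates, ["2015-10-01", "2015-10-02"]))  # only the existing value
    print(between_range(dates, "2015-10-01", "2015-10-04"))
    print(between_range(dates, "2015-09-01", "2015-09-30"))
```

Rounding the lower bound up and the upper bound down keeps a BETWEEN filter tight even when neither endpoint exists in the dictionary, which is the behaviour one would want for date-partitioned cubes like the one described in this thread.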
Re: KYLIN-747 bad query performance when IN clause contains a value
doesn't exist in the dictionary
Posted by Luke Han <lu...@gmail.com>.
The fix version is v1.1. Please try the latest release.
Thanks.
Best Regards!
---------------------
Luke Han
Re: KYLIN-747 bad query performance when IN clause contains a value
doesn't exist in the dictionary
Posted by hongbin ma <ma...@apache.org>.
@liyang
--
Regards,
*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone