Posted to user@phoenix.apache.org by James Taylor <ja...@apache.org> on 2015/03/01 04:57:38 UTC

Re: high cardinality aggregation query performance

Gary,
I've got a patch available on PHOENIX-1690 that fixes the issue for my
tests. Would you mind giving it a whirl?
Thanks,
James

On Fri, Feb 27, 2015 at 6:40 PM, James Taylor <ja...@apache.org> wrote:
> Thanks, Gary. That should be enough for me to repro (though it's a lot
> of data!).
>
> I've always had to hack up the hbase shell script for remote debugging
> and then it seems to work.
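>
> For example (an assumption about your setup, not something from this
> thread): the standard JDWP agent flag appended to HBASE_OPTS in
> hbase-env.sh is the usual approach, with an arbitrary port:
>
> export HBASE_OPTS="$HBASE_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000"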
>
> On Fri, Feb 27, 2015 at 6:37 PM, Gary Schulte
> <gs...@marinsoftware.com> wrote:
>> 509 guideposts according to SYSTEM.STATS.  Getting the table via
>> PhoenixRuntime seems to work; the guideposts are here: http://goo.gl/jvcFec
>>
>>
>> As an aside, I am having issues getting a connection to phoenix/hbase
>> remotely (so I can debug from my IDE).  I have all the ports open that I
>> think would play a part - am I missing anything?
>> 2181, 49255, 60000, 60010, 60020, 60030, 8080, 8085, 9090, and 9095.
>> Connections from the remote host just hang, and I never get an error or
>> a stack trace.
>>
>> Thx
>>
>> -Gary
>>
>>
>> On Fri, Feb 27, 2015 at 5:53 PM, James Taylor <ja...@apache.org>
>> wrote:
>>>
>>> Try this code snippet to see if we can force the stats to be sent over:
>>>
>>> conn.unwrap(PhoenixConnection.class).getQueryServices().clearCache();
>>> PTable table = PhoenixRuntime.getTable(conn, "PERF.BIG_OLAP_DOC");
>>> for (GuidePostsInfo info : table.getTableStats().getGuidePosts().values())
>>> {
>>>     for (byte[] gp : info.getGuidePosts()) {
>>>         System.out.println(Bytes.toStringBinary(gp));
>>>     }
>>> }
>>>
>>> Also, try this query and let me know what it says:
>>>     SELECT sum(GUIDE_POSTS_COUNT)
>>>     FROM SYSTEM.STATS
>>>     WHERE PHYSICAL_NAME = 'PERF.BIG_OLAP_DOC';
>>>
>>> The UPDATE STATISTICS command timing out on the client prevented the
>>> client side from pulling over the new stats until the command completed
>>> on the server side (that's why you only saw them later).
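>>>
>>> One way to avoid that timeout is to raise it on the connection that
>>> issues the command. A minimal sketch - the JDBC URL and the 10-minute
>>> value are placeholders; phoenix.query.timeoutMs is the client-side
>>> query timeout:
>>>
>>> import java.sql.Connection;
>>> import java.sql.DriverManager;
>>> import java.sql.Statement;
>>> import java.util.Properties;
>>>
>>> Properties props = new Properties();
>>> // Raise the client-side query timeout so UPDATE STATISTICS can finish.
>>> props.setProperty("phoenix.query.timeoutMs", "600000");
>>> try (Connection conn =
>>>         DriverManager.getConnection("jdbc:phoenix:localhost", props);
>>>      Statement stmt = conn.createStatement()) {
>>>     stmt.execute("UPDATE STATISTICS PERF.BIG_OLAP_DOC");
>>> }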
>>>
>>> Thanks,
>>> James
>>>
>>> On Fri, Feb 27, 2015 at 5:42 PM, Gary Schulte
>>> <gs...@marinsoftware.com> wrote:
>>> > It appears I spoke too soon.  Presumably once the stats completed
>>> > updating, I now get the same exception:
>>> >
>>> > java.lang.IndexOutOfBoundsException: end index (174) must not be less
>>> > than start index (226)
>>> > at com.google.common.base.Preconditions.checkPositionIndexes(Preconditions.java:388)
>>> > at com.google.common.collect.ImmutableList.subList(ImmutableList.java:362)
>>> > at com.google.common.collect.ImmutableList.subList(ImmutableList.java:62)
>>> > at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:291)
>>> > at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:177)
>>> > at org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:316)
>>> > at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:464)
>>> > at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:394)
>>> > at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:184)
>>> > at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:54)
>>> > at org.apache.phoenix.execute.AggregatePlan.newIterator(AggregatePlan.java:173)
>>> > at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:227)
>>> > at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:154)
>>> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:226)
>>> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:217)
>>> > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>> > at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:216)
>>> > at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1057)
>>> > at sqlline.SqlLine$Commands.execute(SqlLine.java:3673)
>>> > at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>>> > at sqlline.SqlLine.dispatch(SqlLine.java:821)
>>> > at sqlline.SqlLine.begin(SqlLine.java:699)
>>> > at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>>> > at sqlline.SqlLine.main(SqlLine.java:424)
>>> >
>>> >
>>> > Still null for the guideposts though, both with PName.EMPTY_NAME and
>>> > null for the PTableKey constructor.
>>> >
>>> > It certainly appears to be stats related.
>>> >
>>> > -Gary
>>> >
>>> > On Fri, Feb 27, 2015 at 4:06 PM, Gary Schulte
>>> > <gs...@marinsoftware.com>
>>> > wrote:
>>> >>
>>> >> I have the query timeout set too low, but I believe the stats update
>>> >> completed, as I see related rows in the stats table.
>>> >>
>>> >> Both skip and in-list queries run fine - no exceptions.  Still null
>>> >> for the guideposts though - is it likely this is due to the timeout
>>> >> in the stats update?
>>> >>
>>> >> -Gary
>>> >>
>>> >> On Fri, Feb 27, 2015 at 12:30 PM, James Taylor <ja...@apache.org>
>>> >> wrote:
>>> >>>
>>> >>> See inline. Thanks for your help on this one, Gary. It'd be good to
>>> >>> get to the bottom of it so it doesn't bite you again.
>>> >>>
>>> >>> On Fri, Feb 27, 2015 at 11:13 AM, Gary Schulte
>>> >>> <gs...@marinsoftware.com> wrote:
>>> >>> > James,
>>> >>> >
>>> >>> > When I simply added the skip scan hint, I got the same exception
>>> >>> > (even with device_type criteria removed), but the indexes in the
>>> >>> > exception changed.  Interesting - I wouldn't have expected adding
>>> >>> > a skip scan hint to have altered the plan, since it was already
>>> >>> > doing a skip scan.
>>> >>>
>>> >>> The hint tells Phoenix to keep using PK columns in the skip scan,
>>> >>> rather than stopping at the first PK column that isn't being filtered
>>> >>> on. We don't do this by default, because we wouldn't want to if the
>>> >>> cardinality of those columns were very high, and since our stats
>>> >>> don't yet capture cardinality, we can't yet make that call
>>> >>> automatically.
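>>> >>>
>>> >>> As a schematic illustration (hypothetical table, not from this
>>> >>> thread): with a primary key of (a, b, c), a query like
>>> >>>
>>> >>>     SELECT /*+ SKIP_SCAN */ count(*) FROM t WHERE a = 1 AND c IN (2, 3)
>>> >>>
>>> >>> would, without the hint, build skip-scan ranges only over a (b has
>>> >>> no filter, so we stop there); with the hint, the IN list on c is
>>> >>> turned into key ranges as well.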
>>> >>>
>>> >>> >
>>> >>> > 1: current region boundaries, linked so as not to clutter the list
>>> >>> > with hex: http://goo.gl/hFSzYJ
>>> >>> >
>>> >>> > 2: table stats/guideposts, it looks like there are/were none.  The
>>> >>> > output from the guidepost loop was:
>>> >>> >
>>> >>> > null
>>> >>> >
>>> >>> > (this was prior to deleting from system.stats)
>>> >>> >
>>> >>> >
>>> >>> > 3: deleting system.stats appears to have resolved the exception for
>>> >>> > both the explicit varchar inlist and the skip_scan hint.  Skip
>>> >>> > scanning the reduced index space yields much faster results, ~5
>>> >>> > seconds as opposed to 27.
>>> >>>
>>> >>> Glad to hear it's faster with the hint forcing the skip scan across
>>> >>> all your columns.
>>> >>>
>>> >>> I can't explain why deleting the stats resolved the issue, though, as
>>> >>> I would have expected (2) to have returned the stats. FYI, it looks
>>> >>> like the bug is in the code that intersects the guideposts with the
>>> >>> region boundaries.
>>> >>>
>>> >>> >
>>> >>> > Should I expect to rebuild stats often, or is this more of an
>>> >>> > error case?
>>> >>>
>>> >>> Stats are rebuilt automatically when a major compaction occurs and are
>>> >>> updated as splits happen. They can also be manually updated by running
>>> >>> the following command:
>>> >>>
>>> >>> UPDATE STATISTICS PERF.BIG_OLAP_DOC
>>> >>>
>>> >>> For more info on stats, see
>>> >>> http://phoenix.apache.org/update_statistics.html
>>> >>>
>>> >>> If you run this command, does the problem start to reoccur? If so,
>>> >>> would you mind adding this command before running the loop to collect
>>> >>> the guideposts, and letting me know if you see the stats output?
>>> >>>
>>> >>> >
>>> >>> > Thanks again.
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > On Thu, Feb 26, 2015 at 5:55 PM, James Taylor
>>> >>> > <ja...@apache.org>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> Gary,
>>> >>> >> I'm not able to repro the issue - I filed PHOENIX-1690 to track it
>>> >>> >> and attached my test case there. It looks related to the particular
>>> >>> >> state the table is in wrt its region boundaries and current
>>> >>> >> statistics, so I'll need the following additional information to
>>> >>> >> help me repro this:
>>> >>> >>
>>> >>> >> 1) What are the current region boundaries of your table? You can
>>> >>> >> get this programmatically through code like this:
>>> >>> >>
>>> >>> >>         Connection conn = DriverManager.getConnection(getUrl(), props);
>>> >>> >>         List<HRegionLocation> splits = conn.unwrap(PhoenixConnection.class)
>>> >>> >>                 .getQueryServices()
>>> >>> >>                 .getAllTableRegions(Bytes.toBytes("PERF.BIG_OLAP_DOC"));
>>> >>> >>         for (HRegionLocation split : splits) {
>>> >>> >>             System.out.println(Bytes.toStringBinary(split.getRegionInfo().getEndKey()));
>>> >>> >>         }
>>> >>> >>
>>> >>> >> 2) What are the current stats for the table? You can get this
>>> >>> >> programmatically through code like this:
>>> >>> >>
>>> >>> >>         PTable table = conn.unwrap(PhoenixConnection.class)
>>> >>> >>                 .getMetaDataCache()
>>> >>> >>                 .getTable(new PTableKey(null, "PERF.BIG_OLAP_DOC"));
>>> >>> >>         for (GuidePostsInfo info : table.getTableStats().getGuidePosts().values()) {
>>> >>> >>             for (byte[] gp : info.getGuidePosts()) {
>>> >>> >>                 System.out.println(Bytes.toStringBinary(gp));
>>> >>> >>             }
>>> >>> >>         }
>>> >>> >>
>>> >>> >> 3) If you can try after removing all rows from the SYSTEM.STATS
>>> >>> >> table and let me know if the problem still occurs, that'd be
>>> >>> >> helpful too. You can just do the following from sqlline: DELETE
>>> >>> >> FROM SYSTEM.STATS and then exit sqlline, start it again, and rerun
>>> >>> >> the original query.
>>> >>> >>
>>> >>> >> Thanks,
>>> >>> >> James
>>> >>> >>
>>> >>> >> On Thu, Feb 26, 2015 at 10:52 AM, James Taylor
>>> >>> >> <ja...@apache.org>
>>> >>> >> wrote:
>>> >>> >> > Gary,
>>> >>> >> > One possible workaround. Can you try adding the SKIP_SCAN hint
>>> >>> >> > to your query (instead of the AND device_type in
>>> >>> >> > ('MOBILE','DESKTOP','OTHER','TABLET')), like this?
>>> >>> >> >
>>> >>> >> > SELECT /*+ SKIP_SCAN */ count(1) cnt,
>>> >>> >> > ...
>>> >>> >> >
>>> >>> >> > Thanks,
>>> >>> >> > James
>>> >>> >> >
>>> >>> >> > On Wed, Feb 25, 2015 at 10:16 AM, James Taylor
>>> >>> >> > <ja...@apache.org>
>>> >>> >> > wrote:
>>> >>> >> >> Sounds like a bug. I'll try to repro on my end. Thanks for the
>>> >>> >> >> details, Gary.
>>> >>> >> >>
>>> >>> >> >>     James
>>> >>> >> >>
>>> >>> >> >> On Tue, Feb 24, 2015 at 1:49 PM, Gary Schulte
>>> >>> >> >> <gs...@marinsoftware.com> wrote:
>>> >>> >> >>> On Tue, Feb 24, 2015 at 12:29 AM, James Taylor
>>> >>> >> >>> <ja...@apache.org>
>>> >>> >> >>> wrote:
>>> >>> >> >>>>
>>> >>> >> >>>> Based on your query plan, the skip scan is being done solely
>>> >>> >> >>>> based on your salt bucket while the rest of the filtering is
>>> >>> >> >>>> being done by a filter, which means that you're not filtering
>>> >>> >> >>>> based on the leading part of your primary key. We'll know more
>>> >>> >> >>>> once you post your schema, but if NETWORK, KEYWORD_ID and
>>> >>> >> >>>> CUSTOMER_ID formed your primary key constraint, then the skip
>>> >>> >> >>>> scan would work well.
>>> >>> >> >>>>
>>> >>> >> >>>
>>> >>> >> >>> Thanks for your response, James.  Sorry for the slow reply - I
>>> >>> >> >>> had difficulty finding the exact set of test queries I was
>>> >>> >> >>> using for timings.
>>> >>> >> >>> The relevant portion of the olap doc schema is:
>>> >>> >> >>>
>>> >>> >> >>> create table PERF.BIG_OLAP_DOC (
>>> >>> >> >>> client_id                           integer not null
>>> >>> >> >>> ,customer_id                        integer
>>> >>> >> >>> ,time_id                            integer not null
>>> >>> >> >>> ,conversion_type_id                 integer not null
>>> >>> >> >>> ,device_type                        varchar(16)
>>> >>> >> >>> ,keyword_id                         bigint not null
>>> >>> >> >>> ,creative_id                        bigint not null
>>> >>> >> >>> ,placement_id                       bigint not null
>>> >>> >> >>> ,product_target_id                  bigint not null
>>> >>> >> >>> ,network                            varchar(7)
>>> >>> >> >>> ,impressions                        decimal(18, 4)
>>> >>> >> >>> ,publisher_clicks                   decimal(18, 4)
>>> >>> >> >>> ,publisher_cost                     decimal(18, 4)
>>> >>> >> >>> ,conversions                        decimal(18, 4)
>>> >>> >> >>> ,revenue                            decimal(18, 4)
>>> >>> >> >>>
>>> >>> >> >>> [ ...additional metric and dimensional columns ... ]
>>> >>> >> >>>
>>> >>> >> >>>     constraint perf_fact_pk primary key (client_id, time_id,
>>> >>> >> >>> conversion_type_id, device_type, keyword_id, creative_id,
>>> >>> >> >>> placement_id, product_target_id)) SALT_BUCKETS=10;
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>> I am evaluating a 'stitch' case where results from an external
>>> >>> >> >>> system are injected either via a table or (as in this case) an
>>> >>> >> >>> in-list.  An example of one of these test agg queries I am
>>> >>> >> >>> using is:
>>> >>> >> >>>
>>> >>> >> >>> SELECT count(1) cnt,
>>> >>> >> >>>        coalesce(SUM(impressions), 0.0) AS "impressions",
>>> >>> >> >>>        coalesce(SUM(publisher_clicks), 0.0) AS "pub_clicks",
>>> >>> >> >>>        coalesce(SUM(publisher_cost), 0.0) AS "pub_cost",
>>> >>> >> >>>        coalesce(SUM(conversions), 0.0) AS "conversions",
>>> >>> >> >>>        coalesce(SUM(revenue), 0.0) AS "revenue"
>>> >>> >> >>>   FROM perf.big_olap_doc
>>> >>> >> >>>  WHERE time_id between 3000 and 3700
>>> >>> >> >>>    AND network in ('SEARCH')
>>> >>> >> >>>    AND conversion_type_id = 1
>>> >>> >> >>>    AND client_id = 10724
>>> >>> >> >>> --   AND device_type in ('MOBILE','DESKTOP','OTHER','TABLET')
>>> >>> >> >>>    AND keyword_id in (
>>> >>> >> >>> 613214369, 613217307, 613247509, 613248897, 613250382,
>>> >>> >> >>> 613250387,
>>> >>> >> >>> 613252322,
>>> >>> >> >>> 613260252, 613261753, 613261754, 613261759,
>>> >>> >> >>> 613261770, 613261873, 613261884, 613261885, 613261888,
>>> >>> >> >>> 613261889,
>>> >>> >> >>> 613261892,
>>> >>> >> >>> 613261897, 613261913, 613261919, 613261927,
>>> >>> >> >>> 614496021, 843606367, 843606967, 843607021, 843607033,
>>> >>> >> >>> 843607089,
>>> >>> >> >>> 1038731600, 1038731672, 1038731673, 1038731675,
>>> >>> >> >>> 1038731684, 1038731693, 1046990487, 1046990488, 1046990499,
>>> >>> >> >>> 1046990505,
>>> >>> >> >>> 1046990506, 1049724722, 1051109548, 1051311275,
>>> >>> >> >>> 1051311904, 1060574377, 1060574395, 1060574506, 1060574562,
>>> >>> >> >>> 1115915938,
>>> >>> >> >>> 1115915939, 1115915941, 1116310571, 1367495544,
>>> >>> >> >>> 1367495545, 1367497297, 1367497298, 1367497299, 1367497300,
>>> >>> >> >>> 1367497303,
>>> >>> >> >>> 1367497313, 1367497813, 1367497816, 1367497818,
>>> >>> >> >>> 1367497821, 1367497822, 1367497823, 1624976423, 1624976451,
>>> >>> >> >>> 1624976457,
>>> >>> >> >>> 3275636061, 3275640505, 3275645765, 3275645807,
>>> >>> >> >>> 3275649138, 3275651456, 3275651460, 3275651478, 3275651479,
>>> >>> >> >>> 3275654566,
>>> >>> >> >>> 3275654568, 3275654570, 3275654575, 3275659612,
>>> >>> >> >>> 3275659616, 3275659620, 3275668880, 3275669693, 3275675627,
>>> >>> >> >>> 3275675634,
>>> >>> >> >>> 3275677479, 3275677504, 3275678855, 3275679524,
>>> >>> >> >>> 3275679532, 3275680014, 3275682307, 3275682308, 3275682309,
>>> >>> >> >>> 3275682310,
>>> >>> >> >>> 3275682420, 3275682423, 3275682436, 3275682448,
>>> >>> >> >>> 3275682460, 3275682462, 3275682474, 3275684831, 3275688903,
>>> >>> >> >>> 3275694023,
>>> >>> >> >>> 3275694025, 3275694027, 3275695054, 3275695056,
>>> >>> >> >>> 3275695062, 3275699512, 3275699514, 3275699518, 3275701682,
>>> >>> >> >>> 3275701683,
>>> >>> >> >>> 3275701685, 3275701688, 3275703633, 3275703634,
>>> >>> >> >>> 3275703635, 3275703636, 3275703638, 3275703639, 3275704860,
>>> >>> >> >>> 3275704861,
>>> >>> >> >>> 3275764577, 3275797149, 3275798566, 3275798567,
>>> >>> >> >>> 3275798568, 3275798592, 3275931147, 3275942728, 3275945337,
>>> >>> >> >>> 3275945338,
>>> >>> >> >>> 3275945339, 3275945340, 3275945342, 3275945344,
>>> >>> >> >>> 3275946319, 3275946322, 3275946324, 3275946643, 3275949495,
>>> >>> >> >>> 3275949498,
>>> >>> >> >>> 3275949500, 3275950250, 3275955128, 3275955129,
>>> >>> >> >>> 3275955130, 3427017435, 3427017450, 3438304254, 3438304257,
>>> >>> >> >>> 3447068169,
>>> >>> >> >>> 3505227849, 3505227890, 3505556908, 3506351285,
>>> >>> >> >>> 3506351389, 3506351398, 3506351468, 3510037138, 3510038610,
>>> >>> >> >>> 3545590644,
>>> >>> >> >>> 3545594378, 3545595073, 3545595318, 3545595506,
>>> >>> >> >>> 3545597841, 3545598818, 3545599658, 3545599663, 3545601215,
>>> >>> >> >>> 3556080898,
>>> >>> >> >>> 3556080980, 3556080999, 3556081323, 3565122663,
>>> >>> >> >>> 3565122679, 3565122801, 3565122858, 3565122908, 3565122929,
>>> >>> >> >>> 3565122952,
>>> >>> >> >>> 3565122984, 3565123028, 3565123047, 3565123048,
>>> >>> >> >>> 3565123203, 3565123230, 3949988054, 3949988056, 3949988070,
>>> >>> >> >>> 3972992248,
>>> >>> >> >>> 3972992252, 3972992254, 3972992257, 3972992263,
>>> >>> >> >>> 3972992267, 3972992268, 3972992269, 3972992270, 3972992274,
>>> >>> >> >>> 3972992275,
>>> >>> >> >>> 3972992277, 3972992281, 3972992293, 3972992298,
>>> >>> >> >>> 3972992299, 3972992305, 3972992307, 3972992313, 3972992316,
>>> >>> >> >>> 3972992322,
>>> >>> >> >>> 3972992338, 3978471261, 3978471272, 4266318185,
>>> >>> >> >>> 4298107404, 4308853119, 4308853123, 4308853500, 4451174646,
>>> >>> >> >>> 4451174656,
>>> >>> >> >>> 4451174701, 4569827278, 4569827284, 4569827287,
>>> >>> >> >>> 4569827379, 4569827523, 4569827524, 4896589676, 4979049725,
>>> >>> >> >>> 5054587609,
>>> >>> >> >>> 5136433884, 5362640372, 5393109964, 5393405364,
>>> >>> >> >>> 5393405365, 5393405620, 5393405625, 5393405675, 5393405677,
>>> >>> >> >>> 5393405858,
>>> >>> >> >>> 5393405970)
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>> Reading your interpretation of the skip scan, I see that the
>>> >>> >> >>> plan is indicating it is only using the salt and the first
>>> >>> >> >>> three columns of the index: client_id, time_id, and
>>> >>> >> >>> conversion_type_id.  I hadn't considered the salt - that bit
>>> >>> >> >>> of detail in the plan makes more sense to me now.  It now looks
>>> >>> >> >>> like the lackluster performance for higher cardinality
>>> >>> >> >>> aggregations is related to scanning a much larger portion of
>>> >>> >> >>> the key space.  For aggregations where I am not relying on
>>> >>> >> >>> filtering, I am seeing much better performance.
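>>> >>> >> >>>
>>> >>> >> >>> For reference, a minimal sketch of dumping a plan
>>> >>> >> >>> programmatically (the JDBC URL and the WHERE clause are just
>>> >>> >> >>> placeholders):
>>> >>> >> >>>
>>> >>> >> >>> try (Connection conn =
>>> >>> >> >>>         DriverManager.getConnection("jdbc:phoenix:localhost");
>>> >>> >> >>>      Statement stmt = conn.createStatement();
>>> >>> >> >>>      ResultSet rs = stmt.executeQuery(
>>> >>> >> >>>          "EXPLAIN SELECT count(1) FROM perf.big_olap_doc" +
>>> >>> >> >>>          " WHERE client_id = 10724")) {
>>> >>> >> >>>     while (rs.next()) {
>>> >>> >> >>>         System.out.println(rs.getString(1)); // one plan step per row
>>> >>> >> >>>     }
>>> >>> >> >>> }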
>>> >>> >> >>>
>>> >>> >> >>> So to tune this particular stitch case / skip scan, it looks
>>> >>> >> >>> like I need to get the 4th index column into the criteria.
>>> >>> >> >>> There are only four distinct values in the fourth index column
>>> >>> >> >>> (these can/should probably be something other than varchar,
>>> >>> >> >>> but this is what I have loaded currently).  In order to use
>>> >>> >> >>> the keyword_id portion of the index I tried explicitly
>>> >>> >> >>> specifying all device_types via in-list (the commented portion
>>> >>> >> >>> of the query above), but I get a peculiar error:
>>> >>> >> >>>
>>> >>> >> >>> java.lang.IndexOutOfBoundsException: end index (1) must not be
>>> >>> >> >>> less than start index (2)
>>> >>> >> >>> at com.google.common.base.Preconditions.checkPositionIndexes(Preconditions.java:388)
>>> >>> >> >>> at com.google.common.collect.ImmutableList.subList(ImmutableList.java:362)
>>> >>> >> >>> at com.google.common.collect.ImmutableList.subList(ImmutableList.java:62)
>>> >>> >> >>> at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:291)
>>> >>> >> >>> at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:177)
>>> >>> >> >>> at org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:316)
>>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:464)
>>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:394)
>>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:184)
>>> >>> >> >>> at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:54)
>>> >>> >> >>> at org.apache.phoenix.execute.AggregatePlan.newIterator(AggregatePlan.java:173)
>>> >>> >> >>> at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:227)
>>> >>> >> >>> at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:154)
>>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:226)
>>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:217)
>>> >>> >> >>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:216)
>>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1057)
>>> >>> >> >>> at sqlline.SqlLine$Commands.execute(SqlLine.java:3673)
>>> >>> >> >>> at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>>> >>> >> >>> at sqlline.SqlLine.dispatch(SqlLine.java:821)
>>> >>> >> >>> at sqlline.SqlLine.begin(SqlLine.java:699)
>>> >>> >> >>> at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>>> >>> >> >>> at sqlline.SqlLine.main(SqlLine.java:424)
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>> I thought perhaps I was hitting an upper limit on the number
>>> >>> >> >>> of elements in an in-list for a skip scan, and so tried
>>> >>> >> >>> removing the 250 element keyword in-list entirely and leaving
>>> >>> >> >>> only the device_type in-list, but I still get the same error.
>>> >>> >> >>> It happens immediately, even for an explain, so I presume this
>>> >>> >> >>> is a query parsing problem.  Is there a bug or limitation of
>>> >>> >> >>> skip scans and/or sub lists involving varchar?
>>> >>> >> >>>
>>> >>> >> >>> Thx
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >
>>> >>> >
>>> >>
>>> >>
>>> >
>>
>>

Re: high cardinality aggregation query performance

Posted by Gary Schulte <gs...@marinsoftware.com>.
Confirmed - I am not encountering the problem with the v2 patch.

Thanks again

On Mon, Mar 2, 2015 at 2:41 PM, James Taylor <ja...@apache.org> wrote:

> Thanks, Gary. Please let me know once you've tried the simplified patch
> and then I'll get this checked in and into the next point release
> (4.3.1).
>
> On Mon, Mar 2, 2015 at 12:53 PM, Gary Schulte
> <gs...@marinsoftware.com> wrote:
> > I patched using the original (not simplified) patch against 4.3.1 and
> > it appears to have fixed the issue.  I updated stats and waited for it
> > to complete and can no longer reproduce the problem.  I will give the
> > new patch a whirl also just for grins.
> >
> > Thanks,
> >
> > Gary

Re: high cardinality aggregation query performance

Posted by James Taylor <ja...@apache.org>.
Thanks, Gary. Please let me know once you've tried the simplified patch
and then I'll get this checked in and into the next point release
(4.3.1).

On Mon, Mar 2, 2015 at 12:53 PM, Gary Schulte
<gs...@marinsoftware.com> wrote:
> I patched using the original (not simplified) patch against 4.3.1 and it
> appears to have fixed the issue.  I updated stats and waited for it to
> complete and can no longer reproduce the problem.  I will give the new
> patch a whirl also just for grins.
>
> Thanks,
>
> Gary
>
> On Sat, Feb 28, 2015 at 7:57 PM, James Taylor <ja...@apache.org>
> wrote:
>>
>> Gary,
>> I've got a patch available on PHOENIX-1690 that fixes the issue for my
>> tests. Would you mind giving it a whirl?
>> Thanks,
>> James
>>
>> On Fri, Feb 27, 2015 at 6:40 PM, James Taylor <ja...@apache.org>
>> wrote:
>> > Thanks, Gary. That should be enough for me to repro (though it's a lot
>> > of data!).
>> >
>> > I've always had to hack up the hbase shell script for remote debugging
>> > and then it seems to work.
>> >
>> > On Fri, Feb 27, 2015 at 6:37 PM, Gary Schulte
>> > <gs...@marinsoftware.com> wrote:
>> >> 509 guideposts according to system.stats, getting the table via runtime
>> >> seems to work, guide posts, here: http://goo.gl/jvcFec
>> >>
>> >>
>> >> As an aside, I am having issues getting a connection to phoenix/hbase
>> >> remotely (so I can debug from my IDE).  I have all the ports open that
>> >> I
>> >> think would play a part - am I missing anything?
>> >> 2181,49255,60000,60010,60020,60030,8080,8085,9090, and 9095.
>> >> Connections
>> >> from remote just hang and I never get an error or a stack trace.
>> >>
>> >> Thx
>> >>
>> >> -Gary
>> >>
>> >>
>> >> On Fri, Feb 27, 2015 at 5:53 PM, James Taylor <ja...@apache.org>
>> >> wrote:
>> >>>
>> >>> Try this code snippet to see if we can force the stats to be send
>> >>> over:
>> >>>
>> >>> conn.unwrap(PhoenixConnection.class).getQueryServices().clearCache();
>> >>> PTable table = PhoenixRuntime.getTable(conn, "PERF.BIG_OLAP_DOC");
>> >>> for (GuidePostsInfo info : table.getTableStats().getGuidePosts().values()) {
>> >>>     for (byte[] gp : info.getGuidePosts()) {
>> >>>         System.out.println(Bytes.toStringBinary(gp));
>> >>>     }
>> >>> }
>> >>>
>> >>> Also, try this query and let me know what it says:
>> >>>     SELECT sum(GUIDE_POSTS_COUNT)
>> >>>     FROM SYSTEM.STATS
>> >>>     WHERE PHYSICAL_NAME = 'PERF.BIG_OLAP_DOC';
>> >>>
>> >>> The UPDATE STATISTICS command timing out on the client prevented the
>> >>> client side from pulling over the new stats until the collection was
>> >>> complete on the server side (that's why you only saw it later).
>> >>>
>> >>> Thanks,
>> >>> James
>> >>>
>> >>> On Fri, Feb 27, 2015 at 5:42 PM, Gary Schulte
>> >>> <gs...@marinsoftware.com> wrote:
>> >>> > It appears I spoke too soon.  Presumably once the stats completed
>> >>> > updating, I now get the same exception:
>> >>> >
>> >>> > java.lang.IndexOutOfBoundsException: end index (174) must not be less than start index (226)
>> >>> > at com.google.common.base.Preconditions.checkPositionIndexes(Preconditions.java:388)
>> >>> > at com.google.common.collect.ImmutableList.subList(ImmutableList.java:362)
>> >>> > at com.google.common.collect.ImmutableList.subList(ImmutableList.java:62)
>> >>> > at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:291)
>> >>> > at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:177)
>> >>> > at org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:316)
>> >>> > at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:464)
>> >>> > at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:394)
>> >>> > at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:184)
>> >>> > at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:54)
>> >>> > at org.apache.phoenix.execute.AggregatePlan.newIterator(AggregatePlan.java:173)
>> >>> > at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:227)
>> >>> > at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:154)
>> >>> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:226)
>> >>> > at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:217)
>> >>> > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>> >>> > at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:216)
>> >>> > at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1057)
>> >>> > at sqlline.SqlLine$Commands.execute(SqlLine.java:3673)
>> >>> > at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>> >>> > at sqlline.SqlLine.dispatch(SqlLine.java:821)
>> >>> > at sqlline.SqlLine.begin(SqlLine.java:699)
>> >>> > at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>> >>> > at sqlline.SqlLine.main(SqlLine.java:424)
>> >>> >
>> >>> >
>> >>> > Still null for the guideposts though, both with PName.EMPTY_NAME and
>> >>> > null for the PTableKey constructor.
>> >>> >
>> >>> > It certainly appears to be stats related.
>> >>> >
>> >>> > -Gary
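
For context, the lookups being compared above amount to something like the
following sketch, assembled from snippets quoted later in this thread (the
table name, the PTableKey usage, and the PhoenixRuntime call all appear
there); treat it as illustrative rather than a verified reproduction:

    // assumes imports of PhoenixConnection, PName, PTable, PTableKey,
    // and PhoenixRuntime from the org.apache.phoenix.* packages
    PhoenixConnection pconn = conn.unwrap(PhoenixConnection.class);
    // metadata-cache lookup with a null tenant id...
    PTable t1 = pconn.getMetaDataCache().getTable(
        new PTableKey(null, "PERF.BIG_OLAP_DOC"));
    // ...and the same lookup with PName.EMPTY_NAME as the tenant id
    PTable t2 = pconn.getMetaDataCache().getTable(
        new PTableKey(PName.EMPTY_NAME, "PERF.BIG_OLAP_DOC"));
    // PhoenixRuntime resolves the table as well, and was reported to work
    PTable t3 = PhoenixRuntime.getTable(conn, "PERF.BIG_OLAP_DOC");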
>> >>> >
>> >>> > On Fri, Feb 27, 2015 at 4:06 PM, Gary Schulte
>> >>> > <gs...@marinsoftware.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> I have the query timeout set too low, but I believe the stats
>> >>> >> update completed, as I see related rows in the stats table.
>> >>> >>
>> >>> >> Both skip and in-list queries run fine - no exceptions.  Still null
>> >>> >> for the guideposts though - is it likely this is due to the timeout
>> >>> >> in the stats update?
>> >>> >>
>> >>> >> -Gary
>> >>> >>
>> >>> >> On Fri, Feb 27, 2015 at 12:30 PM, James Taylor
>> >>> >> <ja...@apache.org>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> See inline. Thanks for your help on this one, Gary. It'd be good to
>> >>> >>> get to the bottom of it so it doesn't bite you again.
>> >>> >>>
>> >>> >>> On Fri, Feb 27, 2015 at 11:13 AM, Gary Schulte
>> >>> >>> <gs...@marinsoftware.com> wrote:
>> >>> >>> > James,
>> >>> >>> >
>> >>> >>> > When I simply added the skip scan hint, I got the same exception
>> >>> >>> > (even with device_type criteria removed), but the indexes in the
>> >>> >>> > exception changed.  Interesting - I wouldn't have expected adding
>> >>> >>> > a skip scan hint to have altered the plan, since it was already
>> >>> >>> > doing a skip scan.
>> >>> >>>
>> >>> >>> The hint tells Phoenix to keep using PK columns in the skip scan,
>> >>> >>> rather than stopping at the first PK column that isn't being
>> >>> >>> filtered on.  We don't do this by default, because if the
>> >>> >>> cardinality were very high we wouldn't want to do this.  Since our
>> >>> >>> stats don't yet capture cardinality, we can't yet do this
>> >>> >>> automatically.
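
Concretely, the hint is just a comment after SELECT. A sketch against the
query from further down the thread (column list and keyword list
abbreviated), with no device_type filter at all - the hint is what keeps
keyword_id participating in the skip scan:

    SELECT /*+ SKIP_SCAN */ count(1) cnt
      FROM perf.big_olap_doc
     WHERE time_id between 3000 and 3700
       AND network in ('SEARCH')
       AND conversion_type_id = 1
       AND client_id = 10724
       AND keyword_id in (613214369, 613217307, 613247509);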
>> >>> >>>
>> >>> >>> >
>> >>> >>> > 1: current region boundaries, linked so as not to clutter the
>> >>> >>> > list with hex: http://goo.gl/hFSzYJ
>> >>> >>> >
>> >>> >>> > 2: table stats/guideposts - it looks like there are/were none.
>> >>> >>> > The output from the guidepost loop was:
>> >>> >>> >
>> >>> >>> > null
>> >>> >>> >
>> >>> >>> > (this was prior to deleting from system.stats)
>> >>> >>> >
>> >>> >>> > 3: deleting system.stats appears to have resolved the exception
>> >>> >>> > for both the explicit varchar in-list and the skip_scan hint.
>> >>> >>> > Skip scanning the reduced index space yields much faster results,
>> >>> >>> > ~5 seconds as opposed to 27.
>> >>> >>>
>> >>> >>> Glad to hear it's faster with the hint forcing the skip scan
>> >>> >>> across all your columns.
>> >>> >>>
>> >>> >>> I can't explain why deleting the stats resolved the issue, though,
>> >>> >>> as I would have expected (2) to have returned the stats.  FYI, it
>> >>> >>> looks like the bug is in the code that intersects the guideposts
>> >>> >>> with the region boundaries.
>> >>> >>>
>> >>> >>> >
>> >>> >>> > Should I expect to rebuild stats often, or is this more of an
>> >>> >>> > error case?
>> >>> >>>
>> >>> >>> Stats are rebuilt automatically when a major compaction occurs and
>> >>> >>> are updated as splits happen.  They can also be updated manually by
>> >>> >>> running the following command:
>> >>> >>>
>> >>> >>> UPDATE STATISTICS PERF.BIG_OLAP_DOC
>> >>> >>>
>> >>> >>> For more info on stats, see
>> >>> >>> http://phoenix.apache.org/update_statistics.html
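
Putting this together with the SYSTEM.STATS query earlier in this message,
a minimal refresh-and-verify pass looks something like the following (both
statements are reused from this thread):

    UPDATE STATISTICS PERF.BIG_OLAP_DOC;
    -- a non-null, non-zero sum indicates guideposts were collected
    SELECT sum(GUIDE_POSTS_COUNT)
      FROM SYSTEM.STATS
     WHERE PHYSICAL_NAME = 'PERF.BIG_OLAP_DOC';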
>> >>> >>>
>> >>> >>> If you run this command, does the problem start to reoccur? If so,
>> >>> >>> would you mind adding this command before running the loop to
>> >>> >>> collect the guideposts, and letting me know if you see the stats
>> >>> >>> output?
>> >>> >>>
>> >>> >>> >
>> >>> >>> > Thanks again.
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > On Thu, Feb 26, 2015 at 5:55 PM, James Taylor
>> >>> >>> > <ja...@apache.org>
>> >>> >>> > wrote:
>> >>> >>> >>
>> >>> >>> >> Gary,
>> >>> >>> >> I'm not able to repro the issue - I filed PHOENIX-1690 to track
>> >>> >>> >> it and attached my test case there.  It looks related to the
>> >>> >>> >> particular state the table is in wrt its region boundaries and
>> >>> >>> >> current statistics, so I'll need the following additional
>> >>> >>> >> information to help me repro this:
>> >>> >>> >>
>> >>> >>> >> 1) What are the current region boundaries of your table? You
>> >>> >>> >> can get this programmatically through code like this:
>> >>> >>> >>
>> >>> >>> >>         Connection conn = DriverManager.getConnection(getUrl(), props);
>> >>> >>> >>         List<HRegionLocation> splits = conn.unwrap(PhoenixConnection.class).getQueryServices().getAllTableRegions(Bytes.toBytes("PERF.BIG_OLAP_DOC"));
>> >>> >>> >>         for (HRegionLocation split : splits) {
>> >>> >>> >>             System.out.println(Bytes.toStringBinary(split.getRegionInfo().getEndKey()));
>> >>> >>> >>         }
>> >>> >>> >>
>> >>> >>> >> 2) What are the current stats for the table? You can get this
>> >>> >>> >> programmatically through code like this:
>> >>> >>> >>
>> >>> >>> >>         PTable table = conn.unwrap(PhoenixConnection.class).getMetaDataCache().getTable(new PTableKey(null, "PERF.BIG_OLAP_DOC"));
>> >>> >>> >>         for (GuidePostsInfo info : table.getTableStats().getGuidePosts().values()) {
>> >>> >>> >>             for (byte[] gp : info.getGuidePosts()) {
>> >>> >>> >>                 System.out.println(Bytes.toStringBinary(gp));
>> >>> >>> >>             }
>> >>> >>> >>         }
>> >>> >>> >>
>> >>> >>> >> 3) If you can try after removing all rows from the SYSTEM.STATS
>> >>> >>> >> table and let me know if the problem still occurs, that'd be
>> >>> >>> >> helpful too.  You can just do the following from sqlline:
>> >>> >>> >> DELETE FROM SYSTEM.STATS
>> >>> >>> >> and then exit sqlline, start it again, and rerun the original
>> >>> >>> >> query.
>> >>> >>> >>
>> >>> >>> >> Thanks,
>> >>> >>> >> James
>> >>> >>> >>
>> >>> >>> >> On Thu, Feb 26, 2015 at 10:52 AM, James Taylor
>> >>> >>> >> <ja...@apache.org>
>> >>> >>> >> wrote:
>> >>> >>> >> > Gary,
>> >>> >>> >> > One possible workaround: can you try adding the SKIP_SCAN hint
>> >>> >>> >> > to your query (instead of the AND device_type in
>> >>> >>> >> > ('MOBILE','DESKTOP','OTHER','TABLET')), like this?
>> >>> >>> >> >
>> >>> >>> >> > SELECT /*+ SKIP_SCAN */ count(1) cnt,
>> >>> >>> >> > ...
>> >>> >>> >> >
>> >>> >>> >> > Thanks,
>> >>> >>> >> > James
>> >>> >>> >> >
>> >>> >>> >> > On Wed, Feb 25, 2015 at 10:16 AM, James Taylor
>> >>> >>> >> > <ja...@apache.org>
>> >>> >>> >> > wrote:
>> >>> >>> >> >> Sounds like a bug. I'll try to repro on my end. Thanks for
>> >>> >>> >> >> the details, Gary.
>> >>> >>> >> >>
>> >>> >>> >> >>     James
>> >>> >>> >> >>
>> >>> >>> >> >> On Tue, Feb 24, 2015 at 1:49 PM, Gary Schulte
>> >>> >>> >> >> <gs...@marinsoftware.com> wrote:
>> >>> >>> >> >>> On Tue, Feb 24, 2015 at 12:29 AM, James Taylor
>> >>> >>> >> >>> <ja...@apache.org>
>> >>> >>> >> >>> wrote:
>> >>> >>> >> >>>>
>> >>> >>> >> >>>> Based on your query plan, the skip scan is being done
>> >>> >>> >> >>>> solely based on your salt bucket, while the rest of the
>> >>> >>> >> >>>> filtering is being done by a filter, which means that
>> >>> >>> >> >>>> you're not filtering on the leading part of your primary
>> >>> >>> >> >>>> key.  We'll know more once you post your schema, but if
>> >>> >>> >> >>>> NETWORK, KEYWORD_ID, and CUSTOMER_ID formed your primary
>> >>> >>> >> >>>> key constraint, then the skip scan would work well.
>> >>> >>> >> >>>>
>> >>> >>> >> >>>
>> >>> >>> >> >>> Thanks for your response James.  Sorry for the slow reply -
>> >>> >>> >> >>> I had difficulty finding the exact set of test queries I was
>> >>> >>> >> >>> using for timings.
>> >>> >>> >> >>>
>> >>> >>> >> >>> The relevant portion of the olap doc schema is:
>> >>> >>> >> >>>
>> >>> >>> >> >>> create table PERF.BIG_OLAP_DOC (
>> >>> >>> >> >>> client_id                           integer not null
>> >>> >>> >> >>> ,customer_id                        integer
>> >>> >>> >> >>> ,time_id                            integer not null
>> >>> >>> >> >>> ,conversion_type_id                 integer not null
>> >>> >>> >> >>> ,device_type                        varchar(16)
>> >>> >>> >> >>> ,keyword_id                         bigint not null
>> >>> >>> >> >>> ,creative_id                        bigint not null
>> >>> >>> >> >>> ,placement_id                       bigint not null
>> >>> >>> >> >>> ,product_target_id                  bigint not null
>> >>> >>> >> >>> ,network                            varchar(7)
>> >>> >>> >> >>> ,impressions                        decimal(18, 4)
>> >>> >>> >> >>> ,publisher_clicks                   decimal(18, 4)
>> >>> >>> >> >>> ,publisher_cost                     decimal(18, 4)
>> >>> >>> >> >>> ,conversions                        decimal(18, 4)
>> >>> >>> >> >>> ,revenue                            decimal(18, 4)
>> >>> >>> >> >>>
>> >>> >>> >> >>> [ ...additional metric and dimensional columns ... ]
>> >>> >>> >> >>>
>> >>> >>> >> >>>     constraint perf_fact_pk primary key (client_id, time_id,
>> >>> >>> >> >>>     conversion_type_id, device_type, keyword_id, creative_id,
>> >>> >>> >> >>>     placement_id, product_target_id)) SALT_BUCKETS=10;
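
Given that key declaration, each PK column has to be pinned in order for
the skip scan to keep narrowing past it. A sketch annotating the query
below against the key (values reused from it, keyword list abbreviated):

    SELECT count(1)
      FROM perf.big_olap_doc
     WHERE client_id = 10724                                     -- PK #1
       AND time_id between 3000 and 3700                         -- PK #2
       AND conversion_type_id = 1                                -- PK #3
       AND device_type in ('MOBILE','DESKTOP','OTHER','TABLET')  -- PK #4
       AND keyword_id in (613214369, 613217307, 613247509);      -- PK #5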
>> >>> >>> >> >>>
>> >>> >>> >> >>>
>> >>> >>> >> >>> I am evaluating a 'stitch' case where results from an
>> >>> >>> >> >>> external system are injected either via a table or (as in
>> >>> >>> >> >>> this case) an in-list.  An example of one of these test agg
>> >>> >>> >> >>> queries I am using is:
>> >>> >>> >> >>>
>> >>> >>> >> >>> SELECT count(1) cnt,
>> >>> >>> >> >>>        coalesce(SUM(impressions), 0.0) AS "impressions",
>> >>> >>> >> >>>        coalesce(SUM(publisher_clicks), 0.0) AS
>> >>> >>> >> >>> "pub_clicks",
>> >>> >>> >> >>>        coalesce(SUM(publisher_cost), 0.0) AS "pub_cost",
>> >>> >>> >> >>>        coalesce(SUM(conversions), 0.0) AS "conversions",
>> >>> >>> >> >>>        coalesce(SUM(revenue), 0.0) AS "revenue"
>> >>> >>> >> >>>   FROM perf.big_olap_doc
>> >>> >>> >> >>>  WHERE time_id between 3000 and 3700
>> >>> >>> >> >>>    AND network in ('SEARCH')
>> >>> >>> >> >>>    AND conversion_type_id = 1
>> >>> >>> >> >>>    AND client_id = 10724
>> >>> >>> >> >>> --   AND device_type in
>> >>> >>> >> >>> ('MOBILE','DESKTOP','OTHER','TABLET')
>> >>> >>> >> >>>    AND keyword_id in (
>> >>> >>> >> >>> 613214369, 613217307, 613247509, 613248897, 613250382,
>> >>> >>> >> >>> 613250387,
>> >>> >>> >> >>> 613252322,
>> >>> >>> >> >>> 613260252, 613261753, 613261754, 613261759,
>> >>> >>> >> >>> 613261770, 613261873, 613261884, 613261885, 613261888,
>> >>> >>> >> >>> 613261889,
>> >>> >>> >> >>> 613261892,
>> >>> >>> >> >>> 613261897, 613261913, 613261919, 613261927,
>> >>> >>> >> >>> 614496021, 843606367, 843606967, 843607021, 843607033,
>> >>> >>> >> >>> 843607089,
>> >>> >>> >> >>> 1038731600, 1038731672, 1038731673, 1038731675,
>> >>> >>> >> >>> 1038731684, 1038731693, 1046990487, 1046990488, 1046990499,
>> >>> >>> >> >>> 1046990505,
>> >>> >>> >> >>> 1046990506, 1049724722, 1051109548, 1051311275,
>> >>> >>> >> >>> 1051311904, 1060574377, 1060574395, 1060574506, 1060574562,
>> >>> >>> >> >>> 1115915938,
>> >>> >>> >> >>> 1115915939, 1115915941, 1116310571, 1367495544,
>> >>> >>> >> >>> 1367495545, 1367497297, 1367497298, 1367497299, 1367497300,
>> >>> >>> >> >>> 1367497303,
>> >>> >>> >> >>> 1367497313, 1367497813, 1367497816, 1367497818,
>> >>> >>> >> >>> 1367497821, 1367497822, 1367497823, 1624976423, 1624976451,
>> >>> >>> >> >>> 1624976457,
>> >>> >>> >> >>> 3275636061, 3275640505, 3275645765, 3275645807,
>> >>> >>> >> >>> 3275649138, 3275651456, 3275651460, 3275651478, 3275651479,
>> >>> >>> >> >>> 3275654566,
>> >>> >>> >> >>> 3275654568, 3275654570, 3275654575, 3275659612,
>> >>> >>> >> >>> 3275659616, 3275659620, 3275668880, 3275669693, 3275675627,
>> >>> >>> >> >>> 3275675634,
>> >>> >>> >> >>> 3275677479, 3275677504, 3275678855, 3275679524,
>> >>> >>> >> >>> 3275679532, 3275680014, 3275682307, 3275682308, 3275682309,
>> >>> >>> >> >>> 3275682310,
>> >>> >>> >> >>> 3275682420, 3275682423, 3275682436, 3275682448,
>> >>> >>> >> >>> 3275682460, 3275682462, 3275682474, 3275684831, 3275688903,
>> >>> >>> >> >>> 3275694023,
>> >>> >>> >> >>> 3275694025, 3275694027, 3275695054, 3275695056,
>> >>> >>> >> >>> 3275695062, 3275699512, 3275699514, 3275699518, 3275701682,
>> >>> >>> >> >>> 3275701683,
>> >>> >>> >> >>> 3275701685, 3275701688, 3275703633, 3275703634,
>> >>> >>> >> >>> 3275703635, 3275703636, 3275703638, 3275703639, 3275704860,
>> >>> >>> >> >>> 3275704861,
>> >>> >>> >> >>> 3275764577, 3275797149, 3275798566, 3275798567,
>> >>> >>> >> >>> 3275798568, 3275798592, 3275931147, 3275942728, 3275945337,
>> >>> >>> >> >>> 3275945338,
>> >>> >>> >> >>> 3275945339, 3275945340, 3275945342, 3275945344,
>> >>> >>> >> >>> 3275946319, 3275946322, 3275946324, 3275946643, 3275949495,
>> >>> >>> >> >>> 3275949498,
>> >>> >>> >> >>> 3275949500, 3275950250, 3275955128, 3275955129,
>> >>> >>> >> >>> 3275955130, 3427017435, 3427017450, 3438304254, 3438304257,
>> >>> >>> >> >>> 3447068169,
>> >>> >>> >> >>> 3505227849, 3505227890, 3505556908, 3506351285,
>> >>> >>> >> >>> 3506351389, 3506351398, 3506351468, 3510037138, 3510038610,
>> >>> >>> >> >>> 3545590644,
>> >>> >>> >> >>> 3545594378, 3545595073, 3545595318, 3545595506,
>> >>> >>> >> >>> 3545597841, 3545598818, 3545599658, 3545599663, 3545601215,
>> >>> >>> >> >>> 3556080898,
>> >>> >>> >> >>> 3556080980, 3556080999, 3556081323, 3565122663,
>> >>> >>> >> >>> 3565122679, 3565122801, 3565122858, 3565122908, 3565122929,
>> >>> >>> >> >>> 3565122952,
>> >>> >>> >> >>> 3565122984, 3565123028, 3565123047, 3565123048,
>> >>> >>> >> >>> 3565123203, 3565123230, 3949988054, 3949988056, 3949988070,
>> >>> >>> >> >>> 3972992248,
>> >>> >>> >> >>> 3972992252, 3972992254, 3972992257, 3972992263,
>> >>> >>> >> >>> 3972992267, 3972992268, 3972992269, 3972992270, 3972992274,
>> >>> >>> >> >>> 3972992275,
>> >>> >>> >> >>> 3972992277, 3972992281, 3972992293, 3972992298,
>> >>> >>> >> >>> 3972992299, 3972992305, 3972992307, 3972992313, 3972992316,
>> >>> >>> >> >>> 3972992322,
>> >>> >>> >> >>> 3972992338, 3978471261, 3978471272, 4266318185,
>> >>> >>> >> >>> 4298107404, 4308853119, 4308853123, 4308853500, 4451174646,
>> >>> >>> >> >>> 4451174656,
>> >>> >>> >> >>> 4451174701, 4569827278, 4569827284, 4569827287,
>> >>> >>> >> >>> 4569827379, 4569827523, 4569827524, 4896589676, 4979049725,
>> >>> >>> >> >>> 5054587609,
>> >>> >>> >> >>> 5136433884, 5362640372, 5393109964, 5393405364,
>> >>> >>> >> >>> 5393405365, 5393405620, 5393405625, 5393405675, 5393405677,
>> >>> >>> >> >>> 5393405858,
>> >>> >>> >> >>> 5393405970)
>> >>> >>> >> >>>
>> >>> >>> >> >>>
>> >>> >>> >> >>> Reading your interpretation of the skip scan, I see that
>> >>> >>> >> >>> the plan is indicating it is only using the salt and the
>> >>> >>> >> >>> first three columns of the index: client_id, time_id, and
>> >>> >>> >> >>> conversion_type_id.  I hadn't considered the salt - that bit
>> >>> >>> >> >>> of detail in the plan makes more sense to me now.  It looks
>> >>> >>> >> >>> now like the lackluster performance for higher-cardinality
>> >>> >>> >> >>> aggregations is related to scanning a much larger portion of
>> >>> >>> >> >>> the key space.  For aggregations where I am not relying on
>> >>> >>> >> >>> filtering, I am seeing much better performance.
>> >>> >>> >> >>>
>> >>> >>> >> >>> So to tune this particular stitch case / skip scan, it
>> >>> >>> >> >>> looks like I need to get the 4th index column into the
>> >>> >>> >> >>> criteria.  There are only four distinct values in the fourth
>> >>> >>> >> >>> index column (these can/should probably be something other
>> >>> >>> >> >>> than varchar, but this is what I have loaded currently).  In
>> >>> >>> >> >>> order to use the keyword_id portion of the index, I tried
>> >>> >>> >> >>> explicitly specifying all device_types via an in-list (the
>> >>> >>> >> >>> commented portion of the query above), but I get a peculiar
>> >>> >>> >> >>> error:
>> >>> >>> >> >>>
>> >>> >>> >> >>> java.lang.IndexOutOfBoundsException: end index (1) must not be less than start index (2)
>> >>> >>> >> >>> at com.google.common.base.Preconditions.checkPositionIndexes(Preconditions.java:388)
>> >>> >>> >> >>> at com.google.common.collect.ImmutableList.subList(ImmutableList.java:362)
>> >>> >>> >> >>> at com.google.common.collect.ImmutableList.subList(ImmutableList.java:62)
>> >>> >>> >> >>> at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:291)
>> >>> >>> >> >>> at org.apache.phoenix.filter.SkipScanFilter.intersect(SkipScanFilter.java:177)
>> >>> >>> >> >>> at org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:316)
>> >>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:464)
>> >>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:394)
>> >>> >>> >> >>> at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:184)
>> >>> >>> >> >>> at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:54)
>> >>> >>> >> >>> at org.apache.phoenix.execute.AggregatePlan.newIterator(AggregatePlan.java:173)
>> >>> >>> >> >>> at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:227)
>> >>> >>> >> >>> at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:154)
>> >>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:226)
>> >>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:217)
>> >>> >>> >> >>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>> >>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:216)
>> >>> >>> >> >>> at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1057)
>> >>> >>> >> >>> at sqlline.SqlLine$Commands.execute(SqlLine.java:3673)
>> >>> >>> >> >>> at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>> >>> >>> >> >>> at sqlline.SqlLine.dispatch(SqlLine.java:821)
>> >>> >>> >> >>> at sqlline.SqlLine.begin(SqlLine.java:699)
>> >>> >>> >> >>> at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>> >>> >>> >> >>> at sqlline.SqlLine.main(SqlLine.java:424)
>> >>> >>> >> >>>
>> >>> >>> >> >>>
>> >>> >>> >> >>> I thought perhaps I was hitting an upper limit on the
>> >>> >>> >> >>> number of elements in an in-list for a skip scan, and so
>> >>> >>> >> >>> tried removing the 250-element keyword in-list entirely
>> >>> >>> >> >>> and leaving only the device_type in-list, but I still get
>> >>> >>> >> >>> the same error.  It happens immediately, even for an
>> >>> >>> >> >>> explain, so I presume this is a query parsing problem.  Is
>> >>> >>> >> >>> there a bug or limitation of skip scans and/or sub-lists
>> >>> >>> >> >>> involving varchar?
>> >>> >>> >> >>>
>> >>> >>> >> >>> Thx
>> >>> >>> >> >>>
>> >>> >>> >> >>>
>> >>> >>> >> >>>
>> >>> >>> >
>> >>> >>> >
>> >>> >>
>> >>> >>
>> >>> >
>> >>
>> >>
>
>

Re: high cardinality aggregation query performance

Posted by Gary Schulte <gs...@marinsoftware.com>.
I patched using the original (not simplified) patch against 4.3.1 and it
appears to have fixed the issue.  I updated stats and waited for it to
complete and can no longer reproduce the problem.  I will give the new
patch a whirl also just for grins.

Thanks,

Gary
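
One practical wrinkle from earlier in the thread was the client-side
timeout cutting UPDATE STATISTICS short. A hedged sketch of raising it for
a single connection - phoenix.query.timeoutMs is the client-side query
timeout property, while the ten-minute value and the localhost URL are
arbitrary examples:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    Properties props = new Properties();
    // allow long-running statements such as UPDATE STATISTICS to finish
    props.setProperty("phoenix.query.timeoutMs", "600000");
    Connection conn =
        DriverManager.getConnection("jdbc:phoenix:localhost", props);
    conn.createStatement().execute("UPDATE STATISTICS PERF.BIG_OLAP_DOC");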

On Sat, Feb 28, 2015 at 7:57 PM, James Taylor <ja...@apache.org>
wrote:

> Gary,
> I've got a patch available on PHOENIX-1690 that fixes the issue for my
> tests. Would you mind giving it a whirl?
> Thanks,
> James