Posted to dev@phoenix.apache.org by "Alok Singh (JIRA)" <ji...@apache.org> on 2015/10/05 18:06:26 UTC

[jira] [Created] (PHOENIX-2306) Expose additional statistics in the explain plan to allow better cost estimation

Alok Singh created PHOENIX-2306:
-----------------------------------

             Summary: Expose additional statistics in the explain plan to allow better cost estimation 
                 Key: PHOENIX-2306
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2306
             Project: Phoenix
          Issue Type: New Feature
         Environment: 4.5.1
            Reporter: Alok Singh
            Priority: Minor


In a mailing list conversation, James described the Phoenix APIs that can be used to derive cost estimates:
{noformat}
Yes, you could calculate an estimate for this information, but it isn't currently exposed through JDBC or through the explain plan (which would be a good place for it to live). You'd need to dip down to the implementation to get it. Something like this:

PhoenixStatement statement = connection.createStatement().unwrap(PhoenixStatement.class);
ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
QueryPlan plan = statement.getQueryPlan();
List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges will be going over a configurable amount of bytes (determined by phoenix.stats.guidepost.width and/or phoenix.stats.guidepost.per.region), so a simple worst case estimate would be to multiply the ranges.size() by this config value (using a default of QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or 300MB). If the query is a point lookup (which you can check with plan.getContext().getScanRanges().isPointLookup()), then the cost would be ranges.size() * average_row_size.

Since these aren't exposed APIs, they're subject to change. Please file a JIRA if you're interested in helping figure out what the "official" APIs for this should be.
{noformat}
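
As a rough illustration of the arithmetic James describes, the sketch below computes the worst-case estimate from values that a caller would first obtain from the query plan (the number of splits and whether the scan is a point lookup). The class name ScanCostEstimate, the method estimateBytes, and the constant ASSUMED_GUIDEPOST_WIDTH_BYTES are hypothetical names chosen for this example and are not part of Phoenix. The 300MB figure is taken from the quote above and should be replaced by the configured guidepost width, and the average row size has to be supplied by the caller.

```java
// Hypothetical helper, not part of Phoenix: it only performs the arithmetic
// from the quoted message on values the caller has already read from the plan.
final class ScanCostEstimate {

    // Assumption: 300MB, the default guidepost width mentioned in the quote.
    static final long ASSUMED_GUIDEPOST_WIDTH_BYTES = 300L * 1024 * 1024;

    /**
     * Worst-case number of bytes a query may touch.
     *
     * @param splitCount          number of key ranges, e.g. plan.getSplits().size()
     * @param pointLookup         whether the scan is a point lookup
     * @param guidepostWidthBytes configured guidepost width in bytes
     * @param averageRowSizeBytes caller-supplied average row size in bytes
     */
    static long estimateBytes(int splitCount, boolean pointLookup,
                              long guidepostWidthBytes, long averageRowSizeBytes) {
        if (splitCount < 0 || guidepostWidthBytes < 0 || averageRowSizeBytes < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        // A point lookup reads one row per range; otherwise assume each range
        // spans a full guidepost width.
        long bytesPerRange = pointLookup ? averageRowSizeBytes : guidepostWidthBytes;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact((long) splitCount, bytesPerRange);
    }

    public static void main(String[] args) {
        // Range scan over 4 splits using the assumed default width.
        System.out.println(estimateBytes(4, false, ASSUMED_GUIDEPOST_WIDTH_BYTES, 200L));
        // Point lookup over 4 keys with an assumed 200-byte average row.
        System.out.println(estimateBytes(4, true, ASSUMED_GUIDEPOST_WIDTH_BYTES, 200L));
    }
}
```

Keeping the calculation separate from the plan lookup means the same function could back an explain-plan column later without depending on the unofficial APIs.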

Ideally, these statistics should be returned as part of the explain plan. That would allow end users of Phoenix to retrieve this information with standard JDBC tooling.


