You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2021/07/07 05:01:08 UTC

[GitHub] [phoenix] lhofhansl edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

lhofhansl edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447


   If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz, the global uncovered index is not used even when you hint it as expected (I just verified that current 5.x. Phoenix).
   
   I found that uncovered local indexes (that's what I tested) are sometimes much slower than doing to a full table scan. That happens when there is a WHERE clause that an index could be used for, but the WHERE restriction is not selective.
   
   (As noted above, FAST_DIFF (Phoenix' default) is actually the worst choice since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression are far better. I blogged about this here: https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html a while ago: With FAST_DIFF the WHERE clause needed to be **0.5% (return 1/200 of the data)** to be effective. With ROW_INDEX_V1 + ZSTD that was 10%.)
   
   This is at best as good as uncovered local indexing, and probably worse since we need to go remote for each row, unless we do batching. And the batches would still be requiring a SKIP_SCAN, which in the general case is still very slow for FAST_DIFF.
   So I expect with defaults the WHERE clause would need to be somewhere between 1/1000 and 1/300 hundred selective for this to be improvement.
   
   Anyway... I think we should check this in. Presumable folks would create uncovered global indexes only when they know what they are doing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org