Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2021/06/29 17:56:34 UTC
[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870799645


   > @kadirozde @lhofhansl FYI.
   > 
   > 1. Your statement that "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right. In QueryOptimizer.addPlan, for SQL with columns that are not covered by the global index, if the user specifies an index hint and there is a WHERE clause, the SQL is rewritten as
   > "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX" WHERE "0:V1" = 'bar')) AND V2 = 'foo')" (K is the PK of T, V1 is in IDX, and V2 is not). You may want to consider compatibility with the existing code.
   > 
   
   What I meant is that, by default, the uncovered global index is not used. As you pointed out, one can construct a query plan manually using hints to use the uncovered global index. Please note that you can also construct a SQL join statement and achieve the same thing.
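
To illustrate the join alternative, here is a minimal sketch that checks the IN-subquery shape of the hinted rewrite quoted above against a hand-written join. It assumes the T / IDX example from the quote, with made-up rows, and uses SQLite only as a stand-in to compare result sets. SQLite parses the Phoenix hints as plain comments, so this says nothing about which plan Phoenix would pick.

```python
import sqlite3

# Assumption: SQLite stands in for Phoenix purely to compare result sets.
# The /*+ ... */ hints are ordinary comments to SQLite and have no effect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (K TEXT PRIMARY KEY, V1 TEXT, V2 TEXT);
    CREATE INDEX IDX ON T (V1);
    INSERT INTO T VALUES ('a', 'bar', 'foo'), ('b', 'bar', 'baz'),
                         ('c', 'qux', 'foo'), ('d', 'bar', 'foo');
""")

# Shape of the hinted rewrite: find matching keys through V1 (the indexed
# column), then fetch the full rows and filter on the uncovered column V2.
subquery_form = """
    SELECT /*+ NO_INDEX */ K, V1, V2 FROM T
    WHERE K IN (SELECT /*+ INDEX(T IDX) */ K FROM T WHERE V1 = 'bar')
      AND V2 = 'foo'
    ORDER BY K
"""

# Hand-written join: the inner query touches only K and V1, and the outer
# side supplies the uncovered column V2.
join_form = """
    SELECT D.K, D.V1, D.V2 FROM T AS D
    JOIN (SELECT K FROM T WHERE V1 = 'bar') AS I ON D.K = I.K
    WHERE D.V2 = 'foo'
    ORDER BY D.K
"""

subquery_rows = conn.execute(subquery_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(subquery_rows == join_rows)
```

Both forms return the same rows on this sample data; whether Phoenix serves the inner query from IDX is a separate question that this sketch does not answer.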
   
   > 2. Whether scanning the global index and retrieving the corresponding rows from the data table is better than just scanning the data table is a complex problem, because there are many factors to consider, such as network cost, random disk access cost, data distribution, column selectivity, etc. Your statement that "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient. Since Phoenix lacks a CBO framework, it seems sensible to be conservative. I think it is better to leave the choice of this strategy to the user, via an index hint, just as the existing code does.
   
   I agree that there is no guarantee that the index always performs better. However, based on my experience, it will perform better in most cases in practice. This is because the index PK is designed by a user who knows the use case (the type and shape of the queries) and who, in general, wants the index to be used when the index row key prefix length is greater than the data row key prefix length for a given query. I understand your concern; please help me figure out how to proceed. I can add a config param to use uncovered indexes without a specific hint. This means that we will preserve the existing behavior if the config param is not specified. Would that address your concern?
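
The proposal above can be sketched as a small decision function. This is only an illustration, not Phoenix code: the function names, the `uncovered_index_enabled` flag, and the simplification that a "prefix" is the run of leading row key columns constrained by the query are all assumptions made for the example.

```python
def leading_prefix_length(row_key_columns, filtered_columns):
    """Count the leading row key columns that the query constrains."""
    length = 0
    for column in row_key_columns:
        if column not in filtered_columns:
            break
        length += 1
    return length


def use_uncovered_index(data_pk, index_pk, filtered_columns,
                        hinted=False, uncovered_index_enabled=False):
    """Decide whether an uncovered global index would be chosen.

    Hypothetical model: `uncovered_index_enabled` stands for the proposed
    config param; it is off by default to preserve the existing behavior.
    """
    if hinted:
        # Existing behavior: an explicit index hint selects the index.
        return True
    if not uncovered_index_enabled:
        # Config param not set: uncovered indexes are not used by default.
        return False
    # Proposed heuristic: prefer the index when it offers a longer
    # constrained row key prefix than the data table does.
    return (leading_prefix_length(index_pk, filtered_columns)
            > leading_prefix_length(data_pk, filtered_columns))


# Example from the thread: T has PK (K), IDX is keyed on (V1, K), and the
# query filters on V1 (indexed) and V2 (uncovered).
data_pk = ["K"]
index_pk = ["V1", "K"]
filters = {"V1", "V2"}

default_choice = use_uncovered_index(data_pk, index_pk, filters)
enabled_choice = use_uncovered_index(data_pk, index_pk, filters,
                                     uncovered_index_enabled=True)
hinted_choice = use_uncovered_index(data_pk, index_pk, filters, hinted=True)
print(default_choice, enabled_choice, hinted_choice)
```

With the flag left at its default, only the hinted query uses the index, which matches the compatibility concern raised in the review.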
   