Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/06/23 00:36:52 UTC

[GitHub] [skywalking-banyandb] hanahmily commented on pull request #10: Query Module: Logical plan Pt.

hanahmily commented on pull request #10:
URL: https://github.com/apache/skywalking-banyandb/pull/10#issuecomment-866431221


   > > From your screenshots, `Scan` is the first operation even when fields exist. By design, `Select` should be the first operation: it resolves ChunkIDs from the `index`, and the results are then passed to `FetchEntity` (I didn't see this step in the parsed plan). `Scan` is used only when there is no `Selection` input, which means it will rarely be picked.
   > 
   > As I understood, the logical plan does not care about the indexes, it just prepares the metadata and resolves fields so that the existence of these fields which are referenced can be guaranteed.
   > 
   > As we've discussed in the last PR, we can select the indexes while generating physical plans based on cost-first consideration.
   
   Not quite. Cost-based optimization only takes place when a field belongs to more than one index. Suppose `service_id` is covered by two composite indexes, `service_id` + `instance_id` and `service_id` + `endpoint_id`; we should use cost-based statistics to determine which one to use. In this case, the combination of `service_id` and `instance_id` has lower cardinality, so it costs less. Based on that, the physical query optimizer should pick it instead of `service_id` + `endpoint_id`.
   
   As I mentioned in another PR, we don't support composite indexes for now, so we don't have to implement the above plan optimization.
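
   For illustration only, the cost-based choice described above could be sketched roughly as follows. `CompositeIndex`, `pickIndex`, and the cardinality numbers are hypothetical and not part of BanyanDB; the sketch also takes "lower cardinality means lower cost" as its only cost rule.

```go
package main

import "fmt"

// CompositeIndex is a hypothetical descriptor, not a BanyanDB type.
type CompositeIndex struct {
	Fields      []string
	Cardinality int64 // assumed statistic: estimated distinct key combinations
}

// covers reports whether the index includes the given field.
func (idx CompositeIndex) covers(field string) bool {
	for _, f := range idx.Fields {
		if f == field {
			return true
		}
	}
	return false
}

// pickIndex returns the lowest-cardinality index that covers field.
// The boolean is false when no candidate covers the field.
func pickIndex(field string, candidates []CompositeIndex) (CompositeIndex, bool) {
	var best CompositeIndex
	found := false
	for _, idx := range candidates {
		if !idx.covers(field) {
			continue
		}
		if !found || idx.Cardinality < best.Cardinality {
			best, found = idx, true
		}
	}
	return best, found
}

func main() {
	// The cardinalities below are made up for the example.
	candidates := []CompositeIndex{
		{Fields: []string{"service_id", "endpoint_id"}, Cardinality: 50000},
		{Fields: []string{"service_id", "instance_id"}, Cardinality: 800},
	}
	if idx, ok := pickIndex("service_id", candidates); ok {
		fmt.Println("chosen index fields:", idx.Fields)
	}
}
```

   With only single-field indexes, each field has exactly one candidate, so there is nothing for this step to choose between.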
   
   >  the logical plan does not care about the indexes
   
   That's the convention of a traditional SQL database because of its complex index types (single, composite, unique, group by, etc.). Such a database can't determine query paths without statistics. BanyanDB's query doesn't have that limitation: it knows how to access single-field indexes based on the criteria.
   
   We also removed `tags` from the query criteria because filtering during a `Scan` is very expensive. As I mentioned, the query plan picks `Scan` only when no fields are input.
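
   A minimal sketch of that rule, assuming one `Select` per indexed field condition followed by `FetchEntity`, with `Scan` as the fallback. `Condition` and `planSteps` are hypothetical names used only for this example, not the PR's actual types.

```go
package main

import "fmt"

// Condition is a hypothetical single-field predicate from the query criteria.
type Condition struct {
	Field string
	Value string
}

// planSteps lists operator names in execution order.
// With no field conditions the plan falls back to a full Scan;
// otherwise each condition becomes an index Select whose ChunkIDs
// feed a final FetchEntity.
func planSteps(conds []Condition) []string {
	if len(conds) == 0 {
		return []string{"Scan"}
	}
	steps := make([]string, 0, len(conds)+1)
	for _, c := range conds {
		steps = append(steps, fmt.Sprintf("Select(%s)", c.Field))
	}
	return append(steps, "FetchEntity")
}

func main() {
	fmt.Println(planSteps(nil))
	fmt.Println(planSteps([]Condition{{Field: "service_id", Value: "svc-1"}}))
}
```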
   
    

