Posted to issues@carbondata.apache.org by "Venugopal Reddy K (Jira)" <ji...@apache.org> on 2020/05/22 05:29:00 UTC

[jira] [Created] (CARBONDATA-3832) Block Pruning for geospatial polygon expression

Venugopal Reddy K created CARBONDATA-3832:
---------------------------------------------

             Summary: Block Pruning for geospatial polygon expression
                 Key: CARBONDATA-3832
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3832
             Project: CarbonData
          Issue Type: Improvement
    Affects Versions: 2.0.0
            Reporter: Venugopal Reddy K


*[Issue]*

At present, carbon does not perform block/blocklet pruning for polygon filter queries. It performs row-level filtering at the carbon layer and returns the result. With this approach, all carbon files are scanned irrespective of whether a block contains any matching rows. It also incurs Spark overhead to launch many jobs and tasks to process them. This degrades the overall performance of polygon queries.

 

*[Solution]*

We can leverage the existing block pruning mechanism in carbon to skip unwanted blocks, which reduces the number of splits. On the executor side, we can also use blocklet pruning to reduce the number of blocklets that are read and scanned.

This improves polygon query performance.
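
The pruning idea can be illustrated with a minimal sketch. It is not CarbonData code: it assumes that the polygon has already been converted into a list of inclusive ranges of spatial index values, and that each block (or blocklet) keeps the min/max of that index column. The names BlockStats, overlaps and prune_blocks are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockStats:
    """Hypothetical per-block metadata: min/max of the spatial index column."""
    name: str
    min_id: int
    max_id: int


def overlaps(block: BlockStats, ranges: List[Tuple[int, int]]) -> bool:
    # A block can contain matching rows only if its [min_id, max_id]
    # interval intersects at least one inclusive polygon range.
    return any(lo <= block.max_id and block.min_id <= hi for lo, hi in ranges)


def prune_blocks(blocks: List[BlockStats],
                 polygon_ranges: List[Tuple[int, int]]) -> List[BlockStats]:
    # Keep only the blocks that may hold matching rows; the rest are
    # skipped without being read.
    return [b for b in blocks if overlaps(b, polygon_ranges)]


# Assumed example data: three blocks and a polygon covering two index ranges.
blocks = [
    BlockStats("part-0", 0, 99),
    BlockStats("part-1", 100, 199),
    BlockStats("part-2", 200, 299),
]
polygon_ranges = [(120, 130), (180, 210)]

kept = prune_blocks(blocks, polygon_ranges)
print([b.name for b in kept])
```

In this sketch part-0 is dropped before any split is created for it. The same interval check could be applied again per blocklet on the executor side, and row-level filtering would still be needed inside the surviving blocklets, because min/max overlap only shows that a match is possible.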



--
This message was sent by Atlassian Jira
(v8.3.4#803005)