Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/04/15 17:45:45 UTC

[GitHub] [drill] kkhatua opened a new pull request #1750: DRILL-2362: Profile Mgmt

kkhatua opened a new pull request #1750: DRILL-2362: Profile Mgmt
URL: https://github.com/apache/drill/pull/1750
 
 
   This PR is a WIP for managing a large number of profiles. It includes the following features:
   
   1. Write profiles to indexed partitions. These are created on the fly and, by default, organized in nested directories by year, month and date.
   2. Read chronologically from the above partitioned dirs. This improves performance because only the most recent profiles are scanned and retrieved.
   3. Leverage Guava Cache to save the cost of deserializing a profile from disk multiple times. (Even 1 attempt at rendering a profile leads to at least 2 deserializations.)
   4. Infer which partitioned dir holds a profile based on the queryId alone. Rather than scanning all the directories, we reverse-engineer the query ID to work out the approximate start time of the query and narrow down the profile's location.
   5. Trace the exception for a sample bad profile [qId: 259432dc-7f8e-8fc5-af69-16a1ca817689] and make the UI more robust in handling bad profiles that can't be deserialized.
   6. Auto-index from the root dir the first time, in batches of 10000 (synchronized if distributed). ZK is used to maintain synchronization when multiple Drillbits share the same profile location.
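
Items 1 and 4 can be illustrated together with a minimal sketch. It assumes that the upper 32 bits of a query ID hold `Integer.MAX_VALUE` minus the query start time in epoch seconds. That layout is not stated in this description and is only an assumption about how Drill generates query IDs. The class and method names, the use of UTC, and the exact `yyyy/MM/dd` directory pattern are likewise hypothetical and not taken from this PR.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the PR: maps a query ID to a dated partition dir.
class ProfilePartitionSketch {

  // Assumed layout: nested year/month/day directories, computed in UTC.
  private static final DateTimeFormatter PARTITION_FORMAT =
      DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

  // Approximate query start time. ASSUMPTION: the first 8 hex digits of the
  // ID encode (Integer.MAX_VALUE - epochSeconds), so newer queries sort first.
  static Instant approximateStart(String queryId) {
    long high32 = Long.parseLong(queryId.substring(0, 8), 16);
    return Instant.ofEpochSecond(Integer.MAX_VALUE - high32);
  }

  // Relative partition directory in which the profile would be looked up.
  static String partitionFor(String queryId) {
    return PARTITION_FORMAT.format(approximateStart(queryId));
  }

  public static void main(String[] args) {
    // Query ID taken from item 5 above; prints 2018/01/27 under the assumed layout.
    System.out.println(partitionFor("259432dc-7f8e-8fc5-af69-16a1ca817689"));
  }
}
```

Because the inferred time is only approximate, a lookup built this way would probably also need to check the neighbouring partition when a query started close to a day boundary.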
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services