Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2022/06/11 18:38:00 UTC

[jira] [Resolved] (HBASE-17756) We should have better introspection of HFiles

     [ https://issues.apache.org/jira/browse/HBASE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Kyle Purtell resolved HBASE-17756.
-----------------------------------------
    Resolution: Implemented

> We should have better introspection of HFiles
> ---------------------------------------------
>
>                 Key: HBASE-17756
>                 URL: https://issues.apache.org/jira/browse/HBASE-17756
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: HFile
>            Reporter: Esteban Gutierrez
>            Assignee: Rushabh Shah
>            Priority: Major
>         Attachments: 0001-HBASE-17756-We-should-have-better-introspection-of-H.patch
>
>
> [~saint.ack@gmail.com] suggested using DataSketches (https://datasketches.github.io) to write additional statistics to HFiles. These statistics could improve our split decisions, help with troubleshooting, or potentially enable other interesting analysis without full table scans. The statistics could be stored as part of the HFile, but as a first step we could improve the visibility of the data by adding some statistics to HFilePrettyPrinter.
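
To make the idea of per-file statistics concrete, here is a minimal sketch of a single-pass summary over cell sizes, the kind of figure a tool could report without a table scan. It uses neither the DataSketches API nor any HBase class. The class name SizeStats and all of its methods are hypothetical, chosen only for illustration, and an exact running summary stands in for the approximate sketches the description proposes.

```java
// Hypothetical illustration only: not part of HBase or DataSketches.
// Keeps a constant-size running summary of observed sizes in one pass.
public final class SizeStats {
    private long count;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private double mean;

    /** Folds one observed size (for example, a cell's value length) into the summary. */
    public void update(long size) {
        count++;
        min = Math.min(min, size);
        max = Math.max(max, size);
        // Incremental mean: avoids storing every observation.
        mean += (size - mean) / count;
    }

    public long count() { return count; }

    /** Smallest size seen; only meaningful once count() > 0. */
    public long min() { return min; }

    /** Largest size seen; only meaningful once count() > 0. */
    public long max() { return max; }

    /** Arithmetic mean of the sizes seen; 0.0 when nothing was recorded. */
    public double mean() { return mean; }

    @Override
    public String toString() {
        return "count=" + count + " min=" + min + " max=" + max + " mean=" + mean;
    }

    public static void main(String[] args) {
        SizeStats stats = new SizeStats();
        // Made-up sizes standing in for values read while iterating a file.
        for (long size : new long[] {10, 20, 30, 40}) {
            stats.update(size);
        }
        System.out.println(stats);
    }
}
```

A summary like this needs only a fixed amount of memory regardless of file size, which is the same property that makes sketches attractive for storing alongside the data.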



--
This message was sent by Atlassian Jira
(v8.20.7#820007)