Posted to issues@hbase.apache.org by "Michael Stack (Jira)" <ji...@apache.org> on 2020/08/19 20:33:00 UTC

[jira] [Comment Edited] (HBASE-24901) Create versatile hbase-shell table formatter

    [ https://issues.apache.org/jira/browse/HBASE-24901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180806#comment-17180806 ] 

Michael Stack edited comment on HBASE-24901 at 8/19/20, 8:32 PM:
-----------------------------------------------------------------

Lovely. What determines the widths? Is it on by default? Does the width size to the terminal? Does it handle binaries? What happens when a value is 100MB?


was (Author: stack):
Lovely. What determines the widths? Is it on by default? Does the width size to the terminal?

> Create versatile hbase-shell table formatter
> --------------------------------------------
>
>                 Key: HBASE-24901
>                 URL: https://issues.apache.org/jira/browse/HBASE-24901
>             Project: HBase
>          Issue Type: Improvement
>          Components: shell
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Elliot Miller
>            Assignee: Elliot Miller
>            Priority: Major
>         Attachments: HBASE-24901_scan_output_comparison.png
>
>
> As a user, I would like a simple interface for shell output that can be expressed as a table (i.e., output with a fixed number of columns and potentially many rows). To be clear, this new formatter is not specifically for HBase "tables." Table is used in the broader sense here.
> h2. Goals
>  - Do not require more than one output cell loaded in memory at a time
>  - Support many implementations like aligned human-friendly tables, unaligned delimited, and JSON
> h2. Non-goals
>  - Avoiding loading all the headers into memory at once.
>  ** This may seem like a goal with merit, but we are unlikely to find a use case for this formatter with many columns. For example: since HBase tables aren't relational, our scan output will not have an output column for every HBase column. Instead, each output row will correspond to an HBase cell.
>  ** It's also really useful to have the headers ahead of time, because it allows us to do things like JSON object output (where each row is represented with key-value pairs).
> h2. Implementation
> This patch was implemented as a stateful output formatter for data with a fixed number of output columns. Tracking state inside the formatter is an important design feature so that we don't have to feed the formatter all the data at once.
> h2. Formatter Usage Pattern
> The verbose way to use the formatter to print a table is as follows:
> 1. call start_table to reset the formatter's state and pass configuration options
> 2. call start_row to start writing a row
> 3. call cell to write a single cell
> 4. call close_row
> 5. call close_table
> Sometimes, it will feel like this is a lot of method calls, but these calls act as "hooks"
> and give each of the formatter implementations a chance to fill out all the content necessary
> between cells. To cut down on boilerplate, there are shortcut methods like row and single_value_table.
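
The call sequence above can be sketched in Ruby as follows. This is an illustrative sketch only, not the code from the HBASE-24901 patch: the class name SketchJsonFormatter, the headers: keyword argument, and the exact output layout are assumptions made for the example. Only the method names start_table, start_row, cell, close_row, close_table, and row come from the description. The sketch writes each cell as soon as it arrives and uses the headers supplied up front to emit one JSON object per row.

```ruby
require 'json'
require 'stringio'

# Hypothetical sketch (not the actual patch): a stateful formatter that emits
# a JSON array with one object per row, writing each cell immediately.
class SketchJsonFormatter
  def initialize(out = $stdout)
    @out = out
  end

  # Reset state. The headers: option is an assumed way to pass configuration;
  # headers are known ahead of time so each cell can be keyed by column name.
  def start_table(headers:)
    @headers = headers
    @row_count = 0
    @out.puts '['
  end

  # Hook before the first cell of a row: write the separator and opening brace.
  def start_row
    @out.print ",\n" if @row_count > 0
    @out.print '{'
    @col = 0
  end

  # Write one cell; only this single value is held while it is printed.
  def cell(value)
    raise ArgumentError, 'too many cells in row' if @col >= @headers.length
    @out.print ',' if @col > 0
    @out.print "#{@headers[@col].to_json}:#{value.to_json}"
    @col += 1
  end

  # Hook after the last cell of a row: check the column count and close.
  def close_row
    raise ArgumentError, 'too few cells in row' unless @col == @headers.length
    @out.print '}'
    @row_count += 1
  end

  def close_table
    @out.puts if @row_count > 0
    @out.puts ']'
  end

  # Shortcut that wraps start_row, cell, and close_row for a complete row.
  def row(values)
    start_row
    values.each { |v| cell(v) }
    close_row
  end
end

# Usage: the verbose pattern for the first row, the shortcut for the second.
buf = StringIO.new
formatter = SketchJsonFormatter.new(buf)
formatter.start_table(headers: %w[row column value])
formatter.start_row
formatter.cell('r1')
formatter.cell('cf:a')
formatter.cell('v1')
formatter.close_row
formatter.row(['r2', 'cf:a', 'v2'])
formatter.close_table
puts buf.string
```

Because the separators and braces are written inside start_row and close_row, an aligned or delimited formatter could plausibly keep the same five calls and change only what each hook prints.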


