Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2021/07/28 19:11:00 UTC

[jira] [Comment Edited] (HBASE-26122) Limit max result size of individual Gets

    [ https://issues.apache.org/jira/browse/HBASE-26122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388971#comment-17388971 ] 

Bryan Beaudreault edited comment on HBASE-26122 at 7/28/21, 7:10 PM:
---------------------------------------------------------------------

Here's a quick example of this working in the hbase shell:

*Create table*

{{hbase:005:0> create 't1', 'f1'}}
 {{Created table t1}}
 {{Took 1.1306 seconds}}

*Insert test data*

{{hbase:012:0> put 't1', 'r1', 'f1:c1', 'a'}}
 {{Took 0.0416 seconds}}
 {{hbase:014:0> put 't1', 'r1', 'f1:c2', 'b'}}
 {{Took 0.0059 seconds}}
 {{hbase:015:0> put 't1', 'r1', 'f1:c3', 'c'}}
 {{Took 0.0097 seconds}}

*Get without setMaxResultSize, returns full row and {{mayHaveMoreCellsInRow = false}}*

{{hbase:037:0> g = Get.new('r1'.to_s.to_java_bytes)}}
 {{=> #<Java::OrgApacheHadoopHbaseClient::Get:0x11fa11b2>}}
 {{hbase:038:0> result = @hbase.table('t1', @shell).instance_variable_get(:@table).get(g)}}
 {{=> #<Java::OrgApacheHadoopHbaseClient::Result:0x217009bd>}}
 {{hbase:039:0> result.mayHaveMoreCellsInRow}}
 {{=> false}}
 {{hbase:040:0> result.toString}}
 {{=> "keyvalues=\{r1/f1:c1/1627498270850/Put/vlen=1/seqid=0, r1/f1:c2/1627498276326/Put/vlen=1/seqid=0, r1/f1:c3/1627498280413/Put/vlen=1/seqid=0}"}}

*Get with setMaxResultSize, returns first two columns and {{mayHaveMoreCellsInRow = true}}*

{{hbase:059:0> g = Get.new('r1'.to_s.to_java_bytes).setMaxResultSize(100)}}
 {{=> #<Java::OrgApacheHadoopHbaseClient::Get:0x5ed88e31>}}
 {{hbase:060:0> result = @hbase.table('t1', @shell).instance_variable_get(:@table).get(g)}}
 {{=> #<Java::OrgApacheHadoopHbaseClient::Result:0x574e4184>}}
 {{hbase:061:0> result.mayHaveMoreCellsInRow}}
 {{=> true}}
 {{hbase:062:0> result.toString}}
 {{=> "keyvalues=\{r1/f1:c1/1627498270850/Put/vlen=1/seqid=0, r1/f1:c2/1627498276326/Put/vlen=1/seqid=0}"}}
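
For intuition about why a limit of 100 returns two of the three cells, here is a toy model in plain Ruby. It is only a sketch and not the HBase implementation: {{Cell}}, {{limited_get}}, and the 60-byte cell size are hypothetical, and it assumes the size limit is checked after each cell is added to the result.

```ruby
# Toy model only: this is NOT HBase's implementation.
# Cell, limited_get and the 60-byte size are hypothetical, for illustration.
Cell = Struct.new(:qualifier, :size_bytes)

# Collect cells in order. After adding each cell, stop if the running
# size has reached the limit. A limit of nil means "no limit".
# Returns [cells_returned, may_have_more_cells_in_row].
def limited_get(row_cells, max_result_size = nil)
  returned = []
  total = 0
  row_cells.each_with_index do |cell, i|
    returned << cell
    total += cell.size_bytes
    limit_hit = !max_result_size.nil? && total >= max_result_size
    # Flag the result as partial only if cells were actually left behind.
    return [returned, true] if limit_hit && i < row_cells.length - 1
  end
  [returned, false]
end

demo_row = %w[c1 c2 c3].map { |q| Cell.new(q, 60) }

full, more = limited_get(demo_row)
puts "#{full.map(&:qualifier).inspect} more=#{more}"
# prints: ["c1", "c2", "c3"] more=false

partial, more = limited_get(demo_row, 100)
puts "#{partial.map(&:qualifier).inspect} more=#{more}"
# prints: ["c1", "c2"] more=true
```

Because the check runs after a cell is added, this model always returns at least one cell, and the second cell is the one that pushes the running total past 100.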

 



> Limit max result size of individual Gets
> ----------------------------------------
>
>                 Key: HBASE-26122
>                 URL: https://issues.apache.org/jira/browse/HBASE-26122
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client, regionserver
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> Scans have the ability to have a configured max result size, which causes them to return a partial result once the limit has been reached. MultiGets also can throw MultiActionResultTooLarge if the response size is over a configured quota. Neither of these really accounts for a single Get of a too-large row. Such too-large Gets can cause substantial GC pressure or worse if sent at volume.
> Currently one can work around this by converting the Get to a single-row Scan, but this requires a developer either to know about the issue and prepare for it by using a Scan up front, or to wait for the RegionServer to choke on a large request and only then rewrite the Get for future requests.
> We should implement the same response size limits for Get as for Scan, whereby the server returns a partial result to the client for handling.


