Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2021/07/28 19:11:00 UTC
[jira] [Comment Edited] (HBASE-26122) Limit max result size of individual Gets
[ https://issues.apache.org/jira/browse/HBASE-26122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388971#comment-17388971 ]
Bryan Beaudreault edited comment on HBASE-26122 at 7/28/21, 7:10 PM:
---------------------------------------------------------------------
Here's a quick example of this working in the hbase shell:
*Create table*
{{hbase:005:0> create 't1', 'f1'}}
{{Created table t1}}
{{Took 1.1306 seconds}}
*Insert test data*
{{hbase:012:0> put 't1', 'r1', 'f1:c1', 'a'}}
{{Took 0.0416 seconds}}
{{hbase:014:0> put 't1', 'r1', 'f1:c2', 'b'}}
{{Took 0.0059 seconds}}
{{hbase:015:0> put 't1', 'r1', 'f1:c3', 'c'}}
{{Took 0.0097 seconds}}
*Get without setMaxResultSize, returns full row and {{mayHaveMoreCellsInRow = false}}*
{{hbase:037:0> g = Get.new('r1'.to_s.to_java_bytes)}}
{{=> #<Java::OrgApacheHadoopHbaseClient::Get:0x11fa11b2>}}
{{hbase:038:0> result = @hbase.table('t1', @shell).instance_variable_get(:@table).get(g)}}
{{=> #<Java::OrgApacheHadoopHbaseClient::Result:0x217009bd>}}
{{hbase:039:0> result.mayHaveMoreCellsInRow}}
{{=> false}}
{{hbase:040:0> result.toString}}
{{=> "keyvalues=\{r1/f1:c1/1627498270850/Put/vlen=1/seqid=0, r1/f1:c2/1627498276326/Put/vlen=1/seqid=0, r1/f1:c3/1627498280413/Put/vlen=1/seqid=0}"}}
*Get with setMaxResultSize, returns first two columns and {{mayHaveMoreCellsInRow = true}}*
{{hbase:059:0> g = Get.new('r1'.to_s.to_java_bytes).setMaxResultSize(100)}}
{{=> #<Java::OrgApacheHadoopHbaseClient::Get:0x5ed88e31>}}
{{hbase:060:0> result = @hbase.table('t1', @shell).instance_variable_get(:@table).get(g)}}
{{=> #<Java::OrgApacheHadoopHbaseClient::Result:0x574e4184>}}
{{hbase:061:0> result.mayHaveMoreCellsInRow}}
{{=> true}}
{{hbase:062:0> result.toString}}
{{=> "keyvalues=\{r1/f1:c1/1627498270850/Put/vlen=1/seqid=0, r1/f1:c2/1627498276326/Put/vlen=1/seqid=0}"}}
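To make the cutoff above easier to reason about, here is a rough, self-contained Ruby sketch of the idea. It is a toy model, not the patch's server code: {{Cell}}, {{PartialResult}}, {{limited_get}} and the 56-byte per-cell size are all hypothetical, and it assumes the limit is checked after each cell is added, which is consistent with the two-cell output above but is not confirmed by it.

```ruby
# Toy model only: this is NOT HBase's server code. Cell, PartialResult,
# limited_get and the 56-byte size estimate are hypothetical stand-ins.
Cell = Struct.new(:row, :column, :value, :size)
PartialResult = Struct.new(:cells, :may_have_more_cells_in_row)

# Collect cells in order, checking the accumulated size after each cell is
# added. Once the limit is reached while cells still remain, stop and flag
# the result as partial. A nil limit means "no limit".
def limited_get(cells, max_result_size = nil)
  collected = []
  total = 0
  cells.each_with_index do |cell, i|
    collected << cell
    total += cell.size
    more_remaining = i < cells.length - 1
    if max_result_size && total >= max_result_size && more_remaining
      return PartialResult.new(collected, true)
    end
  end
  PartialResult.new(collected, false)
end

# Same shape as the test data above: one row, three columns.
row = %w[c1 c2 c3].zip(%w[a b c]).map do |qualifier, value|
  Cell.new('r1', "f1:#{qualifier}", value, 56) # 56 is an assumed size
end

full = limited_get(row)
limited = limited_get(row, 100)

puts full.cells.map(&:column).inspect
puts full.may_have_more_cells_in_row
puts limited.cells.map(&:column).inspect
puts limited.may_have_more_cells_in_row
```

With these assumed sizes, the unlimited call returns all three columns with the flag unset, and the call with a limit of 100 stops after the second column with the flag set. A caller would treat a set flag as a signal that the row was truncated and decide how to handle it.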
> Limit max result size of individual Gets
> ----------------------------------------
>
> Key: HBASE-26122
> URL: https://issues.apache.org/jira/browse/HBASE-26122
> Project: HBase
> Issue Type: New Feature
> Components: Client, regionserver
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
>
> Scans can be configured with a max result size, which causes them to return a partial result once the limit has been reached. MultiGets can also throw MultiActionResultTooLarge if the response size is over a configured quota. Neither of these really accounts for a single Get of a too-large row. Such too-large Gets can cause substantial GC pressure or worse if sent at volume.
> Currently, one can work around this by converting the Get to a single-row Scan, but this requires a developer either to know about the issue in advance and use a Scan upfront, or to wait for the RegionServer to choke on a large request and only then rewrite the Get for future requests.
> We should implement the same response size limits for Get as for Scan, whereby the server returns a partial result to the client for handling.