Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/08/13 06:17:44 UTC
[jira] Resolved: (HBASE-808) MAX_VERSIONS not respected.
[ https://issues.apache.org/jira/browse/HBASE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-808.
-------------------------
Resolution: Fixed
Committed on branch and trunk. Let's see if it helps with HBASE-826. Thanks for the patch, J-D.
> MAX_VERSIONS not respected.
> ---------------------------
>
> Key: HBASE-808
> URL: https://issues.apache.org/jira/browse/HBASE-808
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.2.1, 0.3.0
>
> Attachments: hbase-808-809-v1.patch
>
>
> Below is a report from the list. I confirmed by playing in the shell that we do indeed have this problem. Let's fix it for 0.2.1.
> {code}
> Hello.
> I ran some tests with HBase 0.2.0 (RC2), focused on insertion and
> timestamp behaviour. I had some surprising results, and I was wondering
> whether people using HBase have already tried this kind of usage, and what
> they concluded.
> First of all, I created a table with the default column attributes, using
> the hbase shell.
> ## TABLE
> hbase(main):008:0> describe 'proxy-0.2'
> {NAME => 'proxy-0.2', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [
> {NAME => 'status', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'header', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'bytes', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'}]}
> Test 1
> I run a loop that inserts the same row with different values at different
> timestamps, starting arbitrarily at 1000 and incrementing by 10. I have a
> method for dumping the row history: it queries for the latest version, then
> queries for each older version using the current version's timestamp minus 1.
> Note that my table object is created once for the entire program life cycle.
> ## GLOBAL CODE
> // somewhere in constructor
> t = new HTable(conf, TABLE_NAME);
> /**
>  * Dump the reversed history of an HBase row, querying for older versions
>  * using the max timestamp of all cells - 1 until no cell is returned.
>  * @param rowKey key of the row to dump
>  */
> private void dumpRowVersions(String rowKey) {
>     Logger.log.info("Versions or row : "+rowKey);
>     try {
>         // first query: the newest version of the row
>         RowResult rr = t.getRow(rowKey);
>         int version = 1;
>         long maxTs;
>
>         do {
>             maxTs = -1;
>             String line = "";
>             // go through all cells of the row
>             for (Map.Entry<byte[], Cell> en : rr.entrySet()) {
>                 long ts = en.getValue().getTimestamp();
>                 maxTs = Math.max(maxTs, ts);
>                 line += new String(en.getKey());
>                 line += " => " + new String(en.getValue().getValue());
>                 line += " ["+ts+"], ";
>             }
>             // remove the trailing comma and space for cleaner output
>             if (line.length() > 0) {
>                 line = line.substring(0, line.length()-2);
>             }
>             // prefix result with a version counter and the max timestamp
>             // found in the cells
>             line = "#"+version+" MXTS["+maxTs+"] "+line;
>             if (maxTs != -1) {
>                 // at least one cell was returned: continue iterating
>                 Logger.log.info(line);
>
>                 // get previous version
>                 version++;
>                 rr = t.getRow(rowKey, maxTs-1);
>             }
>         } while (maxTs != -1);
>
>     } catch (IOException ex) {
>         throw new IllegalStateException("Cannot fetch history of row "+rowKey, ex);
>     }
> }
> ## LOOP CODE
> long ts = 1000;
> do {
>     // insert the testrow with a new timestamp
>     BatchUpdate bu = new BatchUpdate("testrow", ts);
>     bu.put("bytes:", ("valbytes ts "+ts).getBytes());
>     bu.put("status:", ("valstat ts"+ts).getBytes());
>     t.commit(bu);
>     Logger.log.info("-- Inserted ts "+ts);
>
>     // dump row history
>     Thread.sleep(70);
>     dumpRowVersions("testrow");
>
>     // next iteration in two seconds
>     ts += 10;
>     Thread.sleep(2000);
> } while (true);
> ## OUTPUT
> > > Connecting to hbase master...
> > -- Inserted ts 1000
> > Versions or row : testrow
> > #1 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1010
> > Versions or row : testrow
> > #1 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #2 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1020
> > Versions or row : testrow
> > #1 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #2 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #3 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1030
> > Versions or row : testrow
> > #1 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #2 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #3 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #4 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1040
> > Versions or row : testrow
> > #1 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #2 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #3 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #4 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #5 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1050
> > Versions or row : testrow
> > #1 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #2 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #3 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #4 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #5 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #6 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1060
> > Versions or row : testrow
> > #1 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
> > #2 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #3 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #4 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #5 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #6 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #7 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1070
> > Versions or row : testrow
> > #1 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
> > #2 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
> > #3 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #4 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #5 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #6 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #7 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #8 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1080
> > Versions or row : testrow
> > #1 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
> > #2 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
> > #3 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
> > #4 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #5 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #6 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #7 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #8 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #9 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1090
> > Versions or row : testrow
> > #1 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
> > #2 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
> > #3 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
> > #4 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
> > #5 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #6 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #7 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #8 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #9 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #10 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> > -- Inserted ts 1100
> > Versions or row : testrow
> > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
> ts1100 [1100]
> > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
> > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
> > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
> > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
> > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
> > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
> > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
> > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
> > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
> > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> Despite the VERSIONS parameter of the columns (3), it seems that all versions
> are stored.
> Question: is there some garbage collector process that removes the old
> versions? If yes, when does it take place?
> {code}
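For reference, here is a minimal sketch of the behaviour the reporter expected with VERSIONS => '3'. It is a toy in-memory model of a single cell, not HBase code: the class VersionPruningSketch and its methods are hypothetical names invented for this illustration, and it says nothing about when or how HBase itself discards surplus versions. The history() walk mirrors the report's dumpRowVersions approach (take the newest version, then ask again at that timestamp minus 1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one versioned cell (hypothetical, not part of HBase). */
final class VersionPruningSketch {
    private final int maxVersions;
    // timestamp -> value, sorted ascending by timestamp
    private final TreeMap<Long, String> versions = new TreeMap<>();

    VersionPruningSketch(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Stores a value at the given timestamp, then drops the oldest surplus versions. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry(); // smallest timestamp = oldest version
        }
    }

    /** Walks the history newest-first: newest entry, then newest at or before (ts - 1). */
    List<Long> history() {
        List<Long> seen = new ArrayList<>();
        Map.Entry<Long, String> e = versions.lastEntry();
        while (e != null) {
            seen.add(e.getKey());
            e = versions.floorEntry(e.getKey() - 1);
        }
        return seen;
    }

    public static void main(String[] args) {
        VersionPruningSketch cell = new VersionPruningSketch(3);
        // same timestamps as the report: 1000, 1010, ..., 1100
        for (long ts = 1000; ts <= 1100; ts += 10) {
            cell.put(ts, "valbytes ts " + ts);
        }
        System.out.println(cell.history()); // prints [1100, 1090, 1080]
    }
}
```

Under this model the final dump would list only three versions (1100, 1090, 1080), whereas the output in the report lists all eleven.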