Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/08/09 00:38:44 UTC

[jira] Created: (HBASE-808) MAX_VERSIONS not respected.

MAX_VERSIONS not respected.
---------------------------

                 Key: HBASE-808
                 URL: https://issues.apache.org/jira/browse/HBASE-808
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
            Priority: Critical
             Fix For: 0.2.1


Below is a report from the list. I confirmed by playing in the shell that we do indeed have this problem. Let's fix it for 0.2.1.

{code}
Hello.

I ran some tests with HBase 0.2.0 (RC2), focused on insertion and
timestamp behaviour. I got some surprising results, and I was wondering
whether people using HBase have already tried this kind of usage, and what
they concluded.

First of all, I created a table with the default column attributes, using
the hbase shell.



## TABLE

hbase(main):008:0> describe 'proxy-0.2'
{NAME => 'proxy-0.2', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [
{NAME => 'status', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
{NAME => 'header', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
{NAME => 'bytes', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'}]}


Test1

I run a loop that inserts the same row with different values at different
timestamps, starting arbitrarily at 1000 and incrementing by 10. I have a
method for dumping the row history: it queries for the latest version, then
queries for each earlier version using the current version's timestamp minus 1.
Note that my table object is created once for the entire program life cycle.


## GLOBAL CODE

	// somewhere in constructor
	t = new HTable(conf, TABLE_NAME);

	/**
	 * Dump reversed history of a HBase row, querying for older version
	 * using the max timestamp of all cells -1 until there is no cell returned
	 * @param rowKey
	 */
	private void dumpRowVersions(String rowKey) {
		Logger.log.info("Versions or row : "+rowKey);
		try {
			// first query. The newest version of the row
			RowResult rr = t.getRow(rowKey);
			int version = 1;
			long maxTs;
			
			do {
				maxTs = -1;
				String line = "";
				// go through all cells of the row
				for (Map.Entry<byte[], Cell> en : rr.entrySet()) {
					long ts = en.getValue().getTimestamp();
					maxTs = Math.max(maxTs, ts);
					line += new String(en.getKey());
					line += " => " + new String(en.getValue().getValue());
					line += " ["+ts+"], ";
				}

				// remove the trailing comma and space for cleaner output
				if (line.length() > 0) {
					line = line.substring(0, line.length()-2);
				}

				// prefix result with a version counter and the max timestamp 
				// found in the cells
				line = "#"+version+" MXTS["+maxTs+"] "+line;
				if (maxTs != -1) {
					// at least one cell was returned, so continue iterating
					Logger.log.info(line);
					
					// get previous version
					version++;
					rr = t.getRow(rowKey, maxTs-1);
				}
			} while (maxTs != -1);
			
		} catch (IOException ex) {
			throw new IllegalStateException("Cannot fetch history of row " + rowKey, ex);
		}
	}

## LOOP CODE 

			long ts = 1000;
			do {
				// insert the testrow with a new timestamp
				BatchUpdate bu = new BatchUpdate("testrow", ts);
				bu.put("bytes:", ("valbytes ts "+ts).getBytes());
				bu.put("status:", ("valstat ts"+ts).getBytes());
				t.commit(bu);
				Logger.log.info("-- Inserted ts "+ts);
				
				// dump row history
				Thread.sleep(70);
				dumpRowVersions("testrow");
				
				// next iteration in two seconds
				ts += 10;
				Thread.sleep(2000);
			} while (true);

## OUTPUT

> > Connecting to hbase master...
 > -- Inserted ts 1000
 > Versions or row : testrow
 > #1 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1010
 > Versions or row : testrow
 > #1 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #2 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1020
 > Versions or row : testrow
 > #1 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #2 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #3 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1030
 > Versions or row : testrow
 > #1 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #2 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #3 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #4 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1040
 > Versions or row : testrow
 > #1 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #2 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #3 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #4 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #5 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1050
 > Versions or row : testrow
 > #1 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #2 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #3 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #4 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #5 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #6 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1060
 > Versions or row : testrow
 > #1 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060]
 > #2 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #3 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #4 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #5 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #6 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #7 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1070
 > Versions or row : testrow
 > #1 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070]
 > #2 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060]
 > #3 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #4 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #5 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #6 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #7 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #8 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1080
 > Versions or row : testrow
 > #1 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080]
 > #2 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070]
 > #3 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060]
 > #4 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #5 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #6 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #7 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #8 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #9 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1090
 > Versions or row : testrow
 > #1 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090]
 > #2 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080]
 > #3 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070]
 > #4 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060]
 > #5 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #6 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #7 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #8 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #9 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #10 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]
 > -- Inserted ts 1100
 > Versions or row : testrow
 > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat ts1100 [1100]
 > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090]
 > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080]
 > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070]
 > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060]
 > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050]
 > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040]
 > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030]
 > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020]
 > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010]
 > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000]


Despite the VERSIONS parameter of the columns (3), it seems that all versions
are stored.

Question: is there some garbage-collection process that removes the old
versions? If so, when does it run?

{code}
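
For reference, here is a minimal sketch of the behaviour the reporter seems to have expected with VERSIONS => '3'. It uses a plain TreeMap rather than HBase; the class BoundedVersions and its methods are hypothetical names invented for illustration, and trimming on write is only one way such a limit could be applied, not a description of how HBase does it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for one cell's version history; not HBase code. */
public class BoundedVersions {
    private final int maxVersions;
    // timestamp -> value, sorted ascending by timestamp
    private final TreeMap<Long, String> versions = new TreeMap<>();

    public BoundedVersions(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    /** Store a value, then drop the oldest entries beyond maxVersions. */
    public void put(long ts, String value) {
        versions.put(ts, value);
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry();
        }
    }

    /** Walk history as dumpRowVersions does: newest first, then "at or before maxTs - 1". */
    public List<Long> history() {
        List<Long> seen = new ArrayList<>();
        Map.Entry<Long, String> e = versions.lastEntry();
        while (e != null) {
            seen.add(e.getKey());
            e = versions.floorEntry(e.getKey() - 1);
        }
        return seen;
    }

    public static void main(String[] args) {
        BoundedVersions cell = new BoundedVersions(3);
        for (long ts = 1000; ts <= 1100; ts += 10) {
            cell.put(ts, "valbytes ts " + ts);
        }
        System.out.println(cell.history()); // prints [1100, 1090, 1080]
    }
}
```

With this model, the final dump in the report would list three versions instead of eleven.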

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-808) MAX_VERSIONS not respected.

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-808.
-------------------------

    Resolution: Fixed

Committed on branch and trunk. Let's see if it helps with HBASE-826. Thanks for the patch, J-D.

> MAX_VERSIONS not respected.
> ---------------------------
>
>                 Key: HBASE-808
>                 URL: https://issues.apache.org/jira/browse/HBASE-808
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.1, 0.3.0
>
>         Attachments: hbase-808-809-v1.patch
>
>



[jira] Commented: (HBASE-808) MAX_VERSIONS not respected.

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621104#action_12621104 ] 

stack commented on HBASE-808:
-----------------------------

For sure, this stuff is broken. The Memcache handling does not pay attention to the column family/store MAX_VERSIONS.
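
To illustrate the kind of check that is missing, here is a rough sketch of a read that caps what it returns at a maximum version count. It works on a plain NavigableMap, not on the real Memcache; CappedRead and its get method are made-up names, and the ranking rule (a cell is visible only if it is among the newest maxVersions overall) is an assumption about the desired semantics rather than the actual fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical read-side cap; names are illustrative, not taken from the HBase source. */
public class CappedRead {
    /**
     * Return the values at or before asOfTs, newest first, counting only
     * cells that rank among the newest maxVersions overall.
     */
    public static List<String> get(NavigableMap<Long, String> cells, long asOfTs, int maxVersions) {
        List<String> out = new ArrayList<>();
        int rank = 0;
        for (Map.Entry<Long, String> e : cells.descendingMap().entrySet()) {
            if (++rank > maxVersions) {
                break; // everything older is beyond the version limit
            }
            if (e.getKey() <= asOfTs) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> cells = new TreeMap<>();
        for (long ts = 1000; ts <= 1100; ts += 10) {
            cells.put(ts, "v" + ts);
        }
        System.out.println(get(cells, Long.MAX_VALUE, 3)); // prints [v1100, v1090, v1080]
        System.out.println(get(cells, 1079L, 3));          // prints []
    }
}
```

Under this rule the reporter's walk with maxTs-1 would stop after the third version, because the older cells are never handed back.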

> MAX_VERSIONS not respected.
> ---------------------------
>
>                 Key: HBASE-808
>                 URL: https://issues.apache.org/jira/browse/HBASE-808
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.2.1
>
>
> Below is a report from the list.  I confirmed playing in shell that indeed we have this problem.  Lets fix for 0.2.1.
> {code}
> Hello.
> I made some tests with HBase 0.2.0 (RC2), focused on insertion and
> timestamps behaviour. I had some surprising results, and I was wondering if
> people using hbase already tried such an usage, and what was their
> conclusion.
> First of all I created a table with the default column attributes, using
> hbase shell
> ## TABLE
> hbase(main):008:0> describe 'proxy-0.2'
> {NAME => 'proxy-0.2', IS_ROOT => 'false', IS_META => 'false', FAMILIES =>
> [{NAME => 'status', BLOOMFILTER => '
> false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false',
> VERSIONS => '3', TTL => '-1', COM
> PRESSION => 'NONE'}, {NAME => 'header', BLOOMFILTER => 'false', IN_MEMORY =>
> 'false', LENGTH => '2147483647',
> BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'bytes', BLOOMFILTER =>
> 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE =>
> 'false', VERSIONS => '3', TTL => '-1', CO
> MPRESSION => 'NONE'}, {NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY =>
> 'false', LENGTH => '2147483647', B
> LOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'}]}
> Test1
> I make a loop that inserts the same row with different values at different
> timestamps, arbitrary from 1000 incrementing from 10 to 10. I have a method
> for dumping the row history: it makes a query for the last version, and
> queries for past version using the current version timestamp minus 1. Note
> that my table object is created once for entire program life cycle.
> ## GLOBAL CODE
> 	// somewhere in constructor
> 	t = new HTable(conf, TABLE_NAME);
> 	/**
> 	 * Dump reversed history of a HBase row, querying for older version
> 	 * using the max timestamp of all cells -1 until there is no cell returned
> 	 * @param rowKey
> 	 */
> 	private void dumpRowVersions(String rowKey) {
> 		Logger.log.info("Versions or row : "+rowKey);
> 		try {
> 			// first query. The newest version of the row
> 			RowResult rr = t.getRow(rowKey);
> 			int version = 1;
> 			long maxTs;
> 			
> 			do {
> 				maxTs = -1;
> 				String line = "";
> 				// go through all cells of the row
> 				for (Map.Entry en : rr.entrySet()) {
> 					long ts = en.getValue().getTimestamp();
> 					maxTs = Math.max(maxTs, ts);
> 					line += new String(en.getKey());
> 					line += " => " + new String(en.getValue().getValue());
> 					line += " ["+ts+"], ";
> 				}
> 				// remove the last coma and space for smarter output
> 				if (line.length() > 0) {
> 					line = line.substring(0, line.length()-2);
> 				}
> 				// prefix result with a version counter and the max timestamp 
> 				// found in the cells
> 				line = "#"+version+" MXTS["+maxTs+"] "+line;
> 				if (maxTs != -1) {
> 					// there was resulting cell. Continue iteration
> 					Logger.log.info(line);
> 					
> 					// get previous version
> 					version++;
> 					rr = t.getRow(rowKey, maxTs-1);
> 				}
> 			} while (maxTs != -1);
> 			
> 		} catch (IOException ex) {
> 			throw new IllegalStateException("Cannot fetch history of row
> "+rowKey,ex);
> 		}
> 	}
> ## LOOP CODE 
> 			long ts = 1000;
> 			do {
> 				// insert the testrow with a new timestamp
> 				BatchUpdate bu = new BatchUpdate("testrow", ts);
> 				bu.put("bytes:", ("valbytes ts "+ts).getBytes());
> 				bu.put("status:", ("valstat ts"+ts).getBytes());
> 				t.commit(bu);
> 				Logger.log.info("-- Inserted ts "+ts);
> 				
> 				// dump row history
> 				Thread.sleep(70);
> 				dumpRowVersions("testrow");
> 				
> 				// next iteration in two seconds
> 				ts += 10;
> 				Thread.sleep(2000);
> 			} while (true);
> ## OUTPUT
> > > Connecting to hbase master...
>  > -- Inserted ts 1000
>  > Versions or row : testrow
>  > #1 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1010
>  > Versions or row : testrow
>  > #1 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #2 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1020
>  > Versions or row : testrow
>  > #1 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #2 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #3 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1030
>  > Versions or row : testrow
>  > #1 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #2 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #3 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #4 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1040
>  > Versions or row : testrow
>  > #1 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #2 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #3 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #4 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #5 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1050
>  > Versions or row : testrow
>  > #1 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #2 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #3 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #4 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #5 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #6 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1060
>  > Versions or row : testrow
>  > #1 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #2 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #3 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #4 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #5 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #6 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #7 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1070
>  > Versions or row : testrow
>  > #1 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #2 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #3 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #4 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #5 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #6 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #7 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #8 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1080
>  > Versions or row : testrow
>  > #1 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #2 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #3 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #4 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #5 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #6 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #7 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #8 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #9 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1090
>  > Versions or row : testrow
>  > #1 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
>  > #2 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #3 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #4 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #5 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #6 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #7 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #8 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #9 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #10 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1100
>  > Versions or row : testrow
>  > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
> ts1100 [1100]
>  > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
>  > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> Despite the VERSIONS parameter of the columns (3), it seems that all versions
> are stored.
> Question: is there some garbage collector process that removes the old
> versions? If yes, when does it take place?
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-808) MAX_VERSIONS not respected.

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-808:
-------------------------------------

    Attachment: hbase-808-809-v1.patch

This patch fixes both HBASE-808 and HBASE-809. I worked extensively with stack on this one. It closes a hole where we did not check whether there were deletes in snapshots. It also adds a new test based on Jean-Adrien's testing, and checks in Memcache to make sure we honor MAX_VERSIONS.
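
For illustration only, here is a minimal sketch of what honoring MAX_VERSIONS on the read path could look like. It is not the code in the patch and it does not use HBase classes. The class VersionedCell, its methods, the use of a null value as a delete marker, and the rule that deleted versions do not count toward the limit are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model, not HBase code: the history of one cell, newest first.
class VersionedCell {
    // timestamp -> value; a null value stands in for a delete marker (assumption)
    private final TreeMap<Long, String> history =
        new TreeMap<>(Collections.reverseOrder());
    private final int maxVersions;

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long ts, String value) {
        history.put(ts, value);
    }

    void delete(long ts) {
        history.put(ts, null);
    }

    // Only the newest maxVersions live versions are visible at all.
    // Of those, return the ones with timestamp <= upTo, newest first.
    List<String> get(long upTo) {
        List<String> out = new ArrayList<>();
        int live = 0;
        for (Map.Entry<Long, String> e : history.entrySet()) {
            if (e.getValue() == null) {
                continue; // deleted version: hidden, not counted
            }
            if (live == maxVersions) {
                break; // everything older is beyond the limit
            }
            live++;
            if (e.getKey() <= upTo) {
                out.add(e.getValue());
            }
        }
        return out;
    }
}
```

With maxVersions set to 3 and puts at timestamps 1000 through 1100 in steps of 10, as in Jean-Adrien's loop, get(Long.MAX_VALUE) in this model returns only the values written at 1100, 1090 and 1080, and a history walk that asks for timestamp 1070 or lower gets nothing back.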

> MAX_VERSIONS not respected.
> ---------------------------
>
>                 Key: HBASE-808
>                 URL: https://issues.apache.org/jira/browse/HBASE-808
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.2.1
>
>         Attachments: hbase-808-809-v1.patch
>
>
> Below is a report from the list. I confirmed by playing in the shell that we do indeed have this problem. Let's fix this for 0.2.1.
> {code}
> Hello.
> I made some tests with HBase 0.2.0 (RC2), focused on insertion and
> timestamp behaviour. I had some surprising results, and I was wondering whether
> people using HBase have already tried such usage, and what their
> conclusion was.
> First of all, I created a table with the default column attributes, using
> the hbase shell.
> ## TABLE
> hbase(main):008:0> describe 'proxy-0.2'
> {NAME => 'proxy-0.2', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [
> {NAME => 'status', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'header', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'bytes', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'},
> {NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', LENGTH => '2147483647', BLOCKCACHE => 'false', VERSIONS => '3', TTL => '-1', COMPRESSION => 'NONE'}]}
> Test1
> I run a loop that inserts the same row with different values at different
> timestamps, starting arbitrarily at 1000 and incrementing by 10. I have a method
> for dumping the row history: it queries for the latest version, then
> queries for earlier versions using the current version's timestamp minus 1. Note
> that my table object is created once for the entire program life cycle.
> ## GLOBAL CODE
> 	// somewhere in constructor
> 	t = new HTable(conf, TABLE_NAME);
> 	/**
> 	 * Dump the reversed history of an HBase row, querying for older versions
> 	 * using the max timestamp of all cells minus 1 until no cell is returned.
> 	 * @param rowKey key of the row to dump
> 	 */
> 	private void dumpRowVersions(String rowKey) {
> 		Logger.log.info("Versions or row : "+rowKey);
> 		try {
> 			// first query: the newest version of the row
> 			RowResult rr = t.getRow(rowKey);
> 			int version = 1;
> 			long maxTs;
> 			
> 			do {
> 				maxTs = -1;
> 				String line = "";
> 				// go through all cells of the row (column name => Cell)
> 				for (Map.Entry<byte[], Cell> en : rr.entrySet()) {
> 					long ts = en.getValue().getTimestamp();
> 					maxTs = Math.max(maxTs, ts);
> 					line += new String(en.getKey());
> 					line += " => " + new String(en.getValue().getValue());
> 					line += " ["+ts+"], ";
> 				}
> 				// remove the last comma and space for cleaner output
> 				if (line.length() > 0) {
> 					line = line.substring(0, line.length()-2);
> 				}
> 				// prefix result with a version counter and the max timestamp 
> 				// found in the cells
> 				line = "#"+version+" MXTS["+maxTs+"] "+line;
> 				if (maxTs != -1) {
> 					// at least one cell was returned: continue iterating
> 					Logger.log.info(line);
> 					
> 					// get previous version
> 					version++;
> 					rr = t.getRow(rowKey, maxTs-1);
> 				}
> 			} while (maxTs != -1);
> 			
> 		} catch (IOException ex) {
> 			throw new IllegalStateException("Cannot fetch history of row "+rowKey, ex);
> 		}
> 	}
> ## LOOP CODE 
> 			long ts = 1000;
> 			do {
> 				// insert the testrow with a new timestamp
> 				BatchUpdate bu = new BatchUpdate("testrow", ts);
> 				bu.put("bytes:", ("valbytes ts "+ts).getBytes());
> 				bu.put("status:", ("valstat ts"+ts).getBytes());
> 				t.commit(bu);
> 				Logger.log.info("-- Inserted ts "+ts);
> 				
> 				// dump row history
> 				Thread.sleep(70);
> 				dumpRowVersions("testrow");
> 				
> 				// next iteration in two seconds
> 				ts += 10;
> 				Thread.sleep(2000);
> 			} while (true);
> ## OUTPUT
> > > Connecting to hbase master...
>  > -- Inserted ts 1000
>  > Versions or row : testrow
>  > #1 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1010
>  > Versions or row : testrow
>  > #1 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #2 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1020
>  > Versions or row : testrow
>  > #1 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #2 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #3 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1030
>  > Versions or row : testrow
>  > #1 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #2 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #3 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #4 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1040
>  > Versions or row : testrow
>  > #1 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #2 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #3 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #4 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #5 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1050
>  > Versions or row : testrow
>  > #1 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #2 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #3 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #4 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #5 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #6 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1060
>  > Versions or row : testrow
>  > #1 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #2 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #3 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #4 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #5 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #6 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #7 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1070
>  > Versions or row : testrow
>  > #1 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #2 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #3 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #4 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #5 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #6 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #7 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #8 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1080
>  > Versions or row : testrow
>  > #1 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #2 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #3 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #4 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #5 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #6 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #7 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #8 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #9 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1090
>  > Versions or row : testrow
>  > #1 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
>  > #2 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #3 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #4 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #5 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #6 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #7 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #8 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #9 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #10 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
>  > -- Inserted ts 1100
>  > Versions or row : testrow
>  > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
> ts1100 [1100]
>  > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
> ts1090 [1090]
>  > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
> ts1080 [1080]
>  > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
> ts1070 [1070]
>  > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
> ts1060 [1060]
>  > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
> ts1050 [1050]
>  > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
> ts1040 [1040]
>  > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
> ts1030 [1030]
>  > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
> ts1020 [1020]
>  > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
> ts1010 [1010]
>  > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
> ts1000 [1000]
> Despite the VERSIONS parameter of the columns (3), it seems that all versions
> are stored.
> Question: is there some garbage collector process that removes the old
> versions? If yes, when does it take place?
> {code}



[jira] Updated: (HBASE-808) MAX_VERSIONS not respected.

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-808:
------------------------

    Fix Version/s: 0.3.0
         Priority: Blocker  (was: Critical)

Making this a blocker on 0.2.1 and 0.3.0. Reviewed the patch and it looks good. Weird that we haven't run into this issue before. Hopefully Jean-Adrien tries this patch and reports back with a +1 or -1.

> MAX_VERSIONS not respected.
> ---------------------------
>
>                 Key: HBASE-808
>                 URL: https://issues.apache.org/jira/browse/HBASE-808
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.1, 0.3.0
>
>         Attachments: hbase-808-809-v1.patch
