Posted to user@hbase.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/01/09 21:14:53 UTC
How does HBase treat end keys?
Hi,
Whilst working on some tests for Apache Gora, we've discovered a problem
with one of them. The following test [1], which I have also pasted below
(I've made the area of code we are concerned with *bold* to try and point
it out clearly), expects the last key in a range that was deleted to be
present. The developer that reported the issue believes that the end key in
a query should be inclusive, but our test treats it as exclusive. Having
searched the mailing lists [2], I am still not 100% certain what HBase's
behaviour is. I wonder if someone can clarify this for me so that I can make
the commit accordingly and get the test working properly.
Thank you very much in advance for any information/direction on this one.
Kind Regards
Lewis
[1]
http://svn.apache.org/viewvc/incubator/gora/trunk/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java?view=markup
[2]
http://article.gmane.org/gmane.comp.java.hadoop.hbase.user/7017/match=endkey+exclusive
----------------------------------------------
public static void testDeleteByQueryFields(DataStore<String, WebPage> store)
    throws IOException {

  Query<String, WebPage> query;

  //test 5 - delete all with some fields
  WebPageDataCreator.createWebPageData(store);

  query = store.newQuery();
  query.setFields(WebPage.Field.OUTLINKS.getName(),
      WebPage.Field.PARSED_CONTENT.getName(),
      WebPage.Field.CONTENT.getName());

  assertNumResults(store.newQuery(), URLS.length);
  store.deleteByQuery(query);
  store.deleteByQuery(query);
  store.deleteByQuery(query); //don't you love that HBase sometimes does not delete arbitrarily

  store.flush();

  assertNumResults(store.newQuery(), URLS.length);

  //assert that data is deleted
  for (int i = 0; i < SORTED_URLS.length; i++) {
    WebPage page = store.get(SORTED_URLS[i]);
    Assert.assertNotNull(page);

    Assert.assertNotNull(page.getUrl());
    Assert.assertEquals(page.getUrl().toString(), SORTED_URLS[i]);
    Assert.assertEquals(0, page.getOutlinks().size());
    Assert.assertEquals(0, page.getParsedContent().size());
    if (page.getContent() != null) {
      System.out.println("url:" + page.getUrl().toString());
      System.out.println("limit:" + page.getContent().limit());
    } else {
      Assert.assertNull(page.getContent());
    }
  }

  //test 6 - delete some with some fields
  WebPageDataCreator.createWebPageData(store);

  query = store.newQuery();
  query.setFields(WebPage.Field.URL.getName());
  String startKey = SORTED_URLS[NUM_KEYS];
  String endKey = SORTED_URLS[SORTED_URLS.length - NUM_KEYS];
  query.setStartKey(startKey);
  query.setEndKey(endKey);

  assertNumResults(store.newQuery(), URLS.length);
  store.deleteByQuery(query);
  store.deleteByQuery(query);
  store.deleteByQuery(query); //don't you love that HBase sometimes does not delete arbitrarily

  store.flush();

  assertNumResults(store.newQuery(), URLS.length);

  //assert that data is deleted
  for (int i = 0; i < URLS.length; i++) {
    WebPage page = store.get(URLS[i]);
    Assert.assertNotNull(page);
    * if (URLS[i].compareTo(startKey) < 0 || URLS[i].compareTo(endKey) > 0) { *
      //not deleted
      assertWebPage(page, i);
    } else {
      //deleted
      Assert.assertNull(page.getUrl());
      Assert.assertNotNull(page.getOutlinks());
      Assert.assertNotNull(page.getParsedContent());
      Assert.assertNotNull(page.getContent());
      Assert.assertTrue(page.getOutlinks().size() > 0);
      Assert.assertTrue(page.getParsedContent().size() > 0);
    }
  }
--
Lewis
Re: How does HBase treat end keys?
Posted by lars hofhansl <lh...@yahoo.com>.
If you need to make it inclusive, you can append a trailing 0 byte to the byte[] passed to setStopRow.
-- Lars
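As a rough illustration of the trailing-0-byte suggestion above, here is a minimal sketch in plain Java. It does not talk to HBase. The class InclusiveStopRow and its helpers toInclusiveStopRow, compareRows and inScanRange are hypothetical names made up for this example, and the range check only imitates a scan with an exclusive stop row by comparing keys lexicographically as unsigned bytes. The idea is that the key followed by a single 0x00 byte is the smallest key that sorts strictly after it, so using that as the exclusive stop row brings the original key back into the range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class InclusiveStopRow {

    // Returns a copy of stopRow with one trailing 0x00 byte appended.
    // Arrays.copyOf pads the extra position with zero.
    public static byte[] toInclusiveStopRow(byte[] stopRow) {
        return Arrays.copyOf(stopRow, stopRow.length + 1);
    }

    // Lexicographic comparison treating each byte as unsigned (hypothetical
    // stand-in for the row ordering; not an HBase call).
    public static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True when row lies in the half-open range [startRow, stopRow).
    public static boolean inScanRange(byte[] row, byte[] startRow, byte[] stopRow) {
        return compareRows(row, startRow) >= 0 && compareRows(row, stopRow) < 0;
    }

    public static void main(String[] args) {
        byte[] start = "a".getBytes(StandardCharsets.UTF_8);
        byte[] end = "c".getBytes(StandardCharsets.UTF_8);

        // With the plain end key as an exclusive stop row, "c" itself is left out.
        System.out.println("exclusive stop row includes end key: "
                + inScanRange(end, start, end));

        // With the padded stop row, "c" is inside the range.
        System.out.println("padded stop row includes end key: "
                + inScanRange(end, start, toInclusiveStopRow(end)));
    }
}
```

In real client code the padded array would be what gets passed to setStopRow, as described in the reply above. Keys that merely start with the end key (for example "ca") still sort after the padded stop row, so they stay outside the range.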
Re: How does HBase treat end keys?
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Thank you Jean-Daniel, great help.
Regards
Lewis
--
*Lewis*
Re: How does HBase treat end keys?
Posted by Jean-Daniel Cryans <jd...@apache.org>.
From Scan's javadoc:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])
stopRow - row to end at (exclusive)
Hope this helps,
J-D
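To make the consequence for the test concrete, here is a small sketch in plain Java, assuming the delete-by-query ends up as a range with an exclusive end key as the javadoc line above describes. Nothing here is HBase or Gora code. The class ExclusiveEndKeyDemo and the methods deleteRange and expectedToSurvive are hypothetical, and a TreeSet stands in for the sorted row keys because its subSet(from, to) view is also from-inclusive and to-exclusive.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class ExclusiveEndKeyDemo {

    // Simulates a range delete over sorted keys using the half-open
    // range [startKey, endKey). Returns the keys that remain.
    public static TreeSet<String> deleteRange(List<String> keys, String startKey, String endKey) {
        TreeSet<String> remaining = new TreeSet<>(keys);
        // subSet(from, to) includes from and excludes to; clearing the
        // view removes those elements from the backing set.
        remaining.subSet(startKey, endKey).clear();
        return remaining;
    }

    // Predicate a test could use when the end key is exclusive:
    // a key survives if it sorts before startKey, or at or after endKey.
    public static boolean expectedToSurvive(String key, String startKey, String endKey) {
        return key.compareTo(startKey) < 0 || key.compareTo(endKey) >= 0;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        TreeSet<String> remaining = deleteRange(keys, "b", "d");
        for (String key : keys) {
            System.out.println(key
                    + " survives=" + remaining.contains(key)
                    + " expected=" + expectedToSurvive(key, "b", "d"));
        }
    }
}
```

Under that assumption, a check written as compareTo(endKey) > 0 would classify the end key itself as deleted, whereas >= 0 matches exclusive semantics. Whether Gora's own query API should expose the end key as inclusive is a separate design decision that this sketch does not settle.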