Posted to user@hbase.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/01/09 21:14:53 UTC

How does HBase treat end keys?

Hi,

Whilst working on some tests for Apache Gora, we've discovered a problem
with one of them. The following test [1], which I have also pasted below
(I've marked the area of code we are concerned with in *bold* to point
it out clearly), expects the last key in a range that was deleted to be
present. The developer who reported the issue believes that the end key in
a query should be inclusive, but our test treats it as exclusive. Having
searched the mailing lists [2], I am still not 100% certain what HBase's
behaviour is. Could someone clarify this for me so that I can make
the commit accordingly and get the test working properly?

Thank you very much in advance for any information/direction on this one.

Kind Regards

Lewis

[1]
http://svn.apache.org/viewvc/incubator/gora/trunk/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java?view=markup
[2]
http://article.gmane.org/gmane.comp.java.hadoop.hbase.user/7017/match=endkey+exclusive
----------------------------------------------

 public static void testDeleteByQueryFields(DataStore<String, WebPage> store)
     throws IOException {

   Query<String, WebPage> query;

   //test 5 - delete all with some fields
   WebPageDataCreator.createWebPageData(store);

   query = store.newQuery();
   query.setFields(WebPage.Field.OUTLINKS.getName(),
       WebPage.Field.PARSED_CONTENT.getName(),
       WebPage.Field.CONTENT.getName());

   assertNumResults(store.newQuery(), URLS.length);
   store.deleteByQuery(query);
   store.deleteByQuery(query);
   store.deleteByQuery(query); //don't you love that HBase sometimes does not delete arbitrarily

   store.flush();

   assertNumResults(store.newQuery(), URLS.length);

   //assert that data is deleted
   for (int i = 0; i < SORTED_URLS.length; i++) {
     WebPage page = store.get(SORTED_URLS[i]);
     Assert.assertNotNull(page);

     Assert.assertNotNull(page.getUrl());
     Assert.assertEquals(page.getUrl().toString(), SORTED_URLS[i]);
     Assert.assertEquals(0, page.getOutlinks().size());
     Assert.assertEquals(0, page.getParsedContent().size());
     if(page.getContent() != null) {
       System.out.println("url:" + page.getUrl().toString());
       System.out.println( "limit:" + page.getContent().limit());
     } else {
       Assert.assertNull(page.getContent());
     }
   }

   //test 6 - delete some with some fields
   WebPageDataCreator.createWebPageData(store);

   query = store.newQuery();
   query.setFields(WebPage.Field.URL.getName());
   String startKey = SORTED_URLS[NUM_KEYS];
   String endKey = SORTED_URLS[SORTED_URLS.length - NUM_KEYS];
   query.setStartKey(startKey);
   query.setEndKey(endKey);

   assertNumResults(store.newQuery(), URLS.length);
   store.deleteByQuery(query);
   store.deleteByQuery(query);
   store.deleteByQuery(query); //don't you love that HBase sometimes does not delete arbitrarily

   store.flush();

   assertNumResults(store.newQuery(), URLS.length);

   //assert that data is deleted
   for (int i = 0; i < URLS.length; i++) {
     WebPage page = store.get(URLS[i]);
     Assert.assertNotNull(page);
*     if( URLS[i].compareTo(startKey) < 0 || URLS[i].compareTo(endKey) > 0) { *
       //not deleted
       assertWebPage(page, i);
     } else {
       //deleted
       Assert.assertNull(page.getUrl());
       Assert.assertNotNull(page.getOutlinks());
       Assert.assertNotNull(page.getParsedContent());
       Assert.assertNotNull(page.getContent());
       Assert.assertTrue(page.getOutlinks().size() > 0);
       Assert.assertTrue(page.getParsedContent().size() > 0);
     }
   }
 }



-- 
Lewis

Re: How does HBase treat end keys?

Posted by lars hofhansl <lh...@yahoo.com>.
If you need to make it inclusive, you can append a trailing 0 byte to the byte[] passed to setStopRow.

-- Lars
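
The trailing-zero-byte trick works because row keys sort as unsigned bytes in lexicographic order: a key with one 0x00 byte appended is the smallest key that sorts strictly after the original, so an exclusive bound at that key still covers the original row. Below is a minimal sketch of the idea, not HBase code. It assumes Java 9+ for Arrays.compareUnsigned, and the TreeMap, the scan() helper and the row names are hypothetical stand-ins for a table and an exclusive-stop scan.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

public class InclusiveStopRow {

    // Unsigned lexicographic byte order, used here as a stand-in for row key ordering.
    static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    // Returns a copy of row with a single 0x00 byte appended.
    // Arrays.copyOf pads the extra position with zero.
    static byte[] appendZeroByte(byte[] row) {
        return Arrays.copyOf(row, row.length + 1);
    }

    // Hypothetical stand-in for a scan: start inclusive, stop exclusive.
    static List<String> scan(TreeMap<byte[], String> table, byte[] startRow, byte[] stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).values());
    }

    // Builds a tiny in-memory "table" with four made-up rows.
    static TreeMap<byte[], String> sampleTable() {
        TreeMap<byte[], String> table = new TreeMap<>(ROW_ORDER);
        for (String key : new String[] {"row1", "row2", "row3", "row4"}) {
            table.put(key.getBytes(StandardCharsets.UTF_8), key);
        }
        return table;
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> table = sampleTable();
        byte[] start = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "row3".getBytes(StandardCharsets.UTF_8);

        // Exclusive stop: row3 is left out.
        System.out.println(scan(table, start, stop));                 // [row1, row2]
        // Stop key with a trailing zero byte: row3 is now covered.
        System.out.println(scan(table, start, appendZeroByte(stop))); // [row1, row2, row3]
    }
}
```

With a real Scan, the array returned by appendZeroByte would be what gets passed to setStopRow.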



________________________________
 From: Lewis John Mcgibbney <le...@gmail.com>
To: user@hbase.apache.org 
Sent: Monday, January 9, 2012 12:46 PM
Subject: Re: How does HBase treat end keys?
 
Thank you Jean-Daniel, great help.

Regards

Lewis

On Mon, Jan 9, 2012 at 8:19 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> From Scan's javadoc:
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])
>
> stopRow - row to end at (exclusive)
>
> Hope this helps,
>
> J-D



-- 
*Lewis*

Re: How does HBase treat end keys?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Thank you Jean-Daniel, great help.

Regards

Lewis

On Mon, Jan 9, 2012 at 8:19 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> From Scan's javadoc:
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])
>
> stopRow - row to end at (exclusive)
>
> Hope this helps,
>
> J-D



-- 
*Lewis*

Re: How does HBase treat end keys?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
From Scan's javadoc:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[])

stopRow - row to end at (exclusive)

Hope this helps,

J-D
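
For a test that has to decide which keys a range query touched, an exclusive stop row means the range is half-open: the start key is in, the end key is out. The sketch below shows that membership check with plain Java strings. The class name, the inRange helper and the sample keys are made up for illustration, and it assumes the data store passes the query's end key straight through as the scan's stop row, which would need to be confirmed for Gora.

```java
public class HalfOpenRange {

    // True when key lies in [startKey, endKey): start inclusive, end exclusive.
    static boolean inRange(String key, String startKey, String endKey) {
        return key.compareTo(startKey) >= 0 && key.compareTo(endKey) < 0;
    }

    public static void main(String[] args) {
        String[] sortedKeys = {"a", "b", "c", "d", "e"}; // made-up keys
        String startKey = "b";
        String endKey = "d";

        for (String key : sortedKeys) {
            // Under exclusive end semantics, "d" itself is reported as outside.
            String verdict = inRange(key, startKey, endKey) ? "in range" : "outside";
            System.out.println(key + " -> " + verdict);
        }
    }
}
```

Under this reading, a key equal to the end key belongs with the untouched keys, so the complementary "not in range" check would use >= 0 rather than > 0 for the end key comparison. Note that String.compareTo orders by UTF-16 code units, which matches byte order for ASCII keys but not necessarily for others.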
