Posted to user@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2011/08/30 02:23:14 UTC

Re: HBase Scan returns fewer columns after a few minutes of insertion

(sending to user@ and bccing dev@ since this is a user question)

That type of problem can be "fun" to debug, did you try with the shell
to query the data? Do you get a different result?

BTW, any TTL set on that table?

J-D

On Mon, Aug 29, 2011 at 5:09 PM, Neerja Bhatnagar <ne...@gmail.com> wrote:
> Hi,
>
> I am sorry if this question has been resolved before. Thank you for your
> help.
>
> I am seeing really strange behavior with HBase Scan.
>
> I insert 1 row into a table named test, 1 col family named testColFam, and 3
> columns : foo (with value foo), bar (with value bar), and id (a unique id).
>
> I wait 5 minutes, and run the following code to retrieve the row ---
>
> HTablePool htablePool = new HTablePool(config, maxsize);
>
> // "test" is the table name
> HTable table = (HTable) htablePool.getTable("test");
>
> Scan scan = new Scan();
> scan.addFamily(Bytes.toBytes("testColFam"));
> scan.setStartRow(Bytes.toBytes("")); // scan from the first row
> scan.setBatch(batchSize);
>
> ResultScanner resScanner = table.getScanner(scan);
> Iterator<Result> resultIterator = resScanner.iterator();
>
> Result result = resultIterator.next();
>
> result.getMap();
>
> the result.getMap() behaves differently based on time-elapsed. If I run this
> code as soon as I have inserted the data, the 3 columns in the 1 row are
> returned as expected.
>
> But after some time elapses, scan returns fewer columns per row each time.
>
> Can anyone please help me with this? Please let me know if you need more
> information.
>
> Do I need to set the timerange or something to make sure that all columns
> are returned?
>
> Cheers, Neerja
>

Re: HBase Scan returns fewer columns after a few minutes of insertion

Posted by Jean-Daniel Cryans <jd...@apache.org>.
If you want to limit the number of rows you can instead set the
caching to exactly what you need, or set a stop row.

J-D
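
To make the setBatch behaviour discussed in this thread concrete, here is
a toy model in plain Java. It does not call HBase: the class BatchModel and
the method resultsForRow are hypothetical names made up for illustration.
It assumes, as the setBatch Javadoc says, that the batch size caps the
number of values per Result, so a row with 3 columns and a batch of 1 comes
back as 3 separate Results rather than one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: mimics how Scan.setBatch(n) caps the number of values
// per Result. It does not use the HBase client API.
class BatchModel {
    // Splits one row's columns into chunks of at most `batch` values,
    // the way a batched scan hands back one row across several Results.
    // A non-positive batch is treated as "no limit": the row comes back whole.
    static List<List<String>> resultsForRow(List<String> columns, int batch) {
        List<List<String>> results = new ArrayList<>();
        if (batch <= 0) {
            results.add(new ArrayList<>(columns));
            return results;
        }
        for (int i = 0; i < columns.size(); i += batch) {
            int end = Math.min(i + batch, columns.size());
            results.add(new ArrayList<>(columns.subList(i, end)));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> row = List.of("bar", "foo", "id");
        // Batch of 1: three Results, one column each.
        System.out.println(resultsForRow(row, 1)); // [[bar], [foo], [id]]
        // Batch equal to the column count: one Result with everything.
        System.out.println(resultsForRow(row, 3)); // [[bar, foo, id]]
    }
}
```

In this model, reading only the first Result with a batch of 1 shows a
single column, which matches the symptom reported in the thread.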

On Mon, Aug 29, 2011 at 11:38 PM, Neerja Bhatnagar <ne...@gmail.com> wrote:
> Hi J-D,
>
> Thank you very much! Hopefully, this iteration clears it up for me.
> The batchSize is set to 1. I tried the same code with batchSize set to
> nothing or the same number as the number of columns in my column family.
> When the batchsize is not set, or is set to the number of columns in the
> column family I am retrieving from getMap or getFamilyMap, then the entire
> result (as expected) is returned.
>
> Is batchsize setting the number of columns to return, rather than number of
> rows?
> I am sorry, but it is not clear to me whether setBatch in Scan is
> row-oriented or column-oriented.
>
> Perhaps, I should use the PageFilter to limit the number of rows retrieved
> from HBase?
> setBatch
>
> public void *setBatch*(int batch)
>
> Set the maximum number of values to return for each call to next()
>
> *Parameters:* batch - the maximum number of values
>
> Your help is much appreciated. Cheers, Neerja
>
> On Mon, Aug 29, 2011 at 7:07 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> (Sending to user@ again and bccing dev@ for the last time, please take
>> notice and reply to user@)
>>
>> Ok so it should be something about the code... what is batchSize set
>> to? I don't see it in that code snippet.
>>
>> getMap gives a map of all the families with all the data, whereas
>> getFamilyMap gives a map of all the qualifiers and their values for
>> one family. Both APIs are good; they just solve different problems.
>>
>> J-D
>>
>> On Mon, Aug 29, 2011 at 6:00 PM, Neerja Bhatnagar <ne...@gmail.com>
>> wrote:
>> > Hi J-D,
>> >
>> > Thanks! I do scan 'tablename' on the shell, and I can see all 3 columns
>> in
>> > the 1 column family for a row. I haven't set any TTL on the table or
>> result
>> > scanner.
>> > Any other suggestions would be very welcome. I was getting the same
>> response
>> > with result.getFamilyMap() and I moved to result.getMap() thinking I was
>> > using the wrong api.
>> >
>> > Cheers, Neerja
>> >
>> > On Mon, Aug 29, 2011 at 5:23 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> (sending to user@ and bccing dev@ since this is a user question)
>> >>
>> >> That type of problem can be "fun" to debug, did you try with the shell
>> >> to query the data? Do you get a different result?
>> >>
>> >> BTW, any TTL set on that table?
>> >>
>> >> J-D
>> >>
>>
>

Re: HBase Scan returns fewer columns after a few minutes of insertion

Posted by Neerja Bhatnagar <ne...@gmail.com>.
Hi J-D,

Thank you very much! Hopefully, this iteration clears it up for me.
The batchSize is set to 1. I tried the same code with batchSize set to
nothing or the same number as the number of columns in my column family.
When the batchsize is not set, or is set to the number of columns in the
column family I am retrieving from getMap or getFamilyMap, then the entire
result (as expected) is returned.

Is batchsize setting the number of columns to return, rather than number of
rows?
I am sorry, but it is not clear to me whether setBatch in Scan is
row-oriented or column-oriented.

Perhaps, I should use the PageFilter to limit the number of rows retrieved
from HBase?
setBatch

public void *setBatch*(int batch)

Set the maximum number of values to return for each call to next()

*Parameters:* batch - the maximum number of values

Your help is much appreciated. Cheers, Neerja

On Mon, Aug 29, 2011 at 7:07 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> (Sending to user@ again and bccing dev@ for the last time, please take
> notice and reply to user@)
>
> Ok so it should be something about the code... what is batchSize set
> to? I don't see it in that code snippet.
>
> getMap gives a map of all the families with all the data, whereas
> getFamilyMap gives a map of all the qualifiers and their values for
> one family. Both APIs are good; they just solve different problems.
>
> J-D
>
> On Mon, Aug 29, 2011 at 6:00 PM, Neerja Bhatnagar <ne...@gmail.com>
> wrote:
> > Hi J-D,
> >
> > Thanks! I do scan 'tablename' on the shell, and I can see all 3 columns
> in
> > the 1 column family for a row. I haven't set any TTL on the table or
> result
> > scanner.
> > Any other suggestions would be very welcome. I was getting the same
> response
> > with result.getFamilyMap() and I moved to result.getMap() thinking I was
> > using the wrong api.
> >
> > Cheers, Neerja
> >
> > On Mon, Aug 29, 2011 at 5:23 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> (sending to user@ and bccing dev@ since this is a user question)
> >>
> >> That type of problem can be "fun" to debug, did you try with the shell
> >> to query the data? Do you get a different result?
> >>
> >> BTW, any TTL set on that table?
> >>
> >> J-D
> >>
>

Re: HBase Scan returns fewer columns after a few minutes of insertion

Posted by Jean-Daniel Cryans <jd...@apache.org>.
(Sending to user@ again and bccing dev@ for the last time, please take
notice and reply to user@)

Ok so it should be something about the code... what is batchSize set
to? I don't see it in that code snippet.

getMap gives a map of all the families with all the data, whereas
getFamilyMap gives a map of all the qualifiers and their values for
one family. Both APIs are good; they just solve different problems.

J-D
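
As a rough illustration of the two shapes described above, here is a toy
model in plain Java. It does not use the HBase client: ResultShapeModel,
sampleRow and familyMap are hypothetical names, keys are Strings rather
than byte arrays, and the timestamps and the "old-id"/"new-id" values are
invented. It assumes Result.getMap() nests family -> qualifier ->
timestamp -> value, and that Result.getFamilyMap(family) flattens one
family to qualifier -> latest value.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: nested TreeMaps standing in for the shapes returned by
// Result.getMap() and Result.getFamilyMap(family).
class ResultShapeModel {
    // family -> qualifier -> timestamp -> value (the getMap()-like shape).
    static NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> sampleRow() {
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = new TreeMap<>();
        for (String q : new String[] {"foo", "bar"}) {
            NavigableMap<Long, String> versions = new TreeMap<>();
            versions.put(1L, q); // column "foo" holds "foo", "bar" holds "bar"
            qualifiers.put(q, versions);
        }
        NavigableMap<Long, String> idVersions = new TreeMap<>();
        idVersions.put(1L, "old-id"); // invented values, two versions
        idVersions.put(2L, "new-id");
        qualifiers.put("id", idVersions);

        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
                new TreeMap<>();
        families.put("testColFam", qualifiers);
        return families;
    }

    // qualifier -> latest value for one family (the getFamilyMap()-like shape).
    // In this toy the versions are sorted ascending, so the latest is lastEntry().
    static NavigableMap<String, String> familyMap(
            NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> full,
            String family) {
        NavigableMap<String, String> latest = new TreeMap<>();
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = full.get(family);
        if (qualifiers == null) {
            return latest; // toy choice: unknown family gives an empty map
        }
        for (Map.Entry<String, NavigableMap<Long, String>> e : qualifiers.entrySet()) {
            latest.put(e.getKey(), e.getValue().lastEntry().getValue());
        }
        return latest;
    }

    public static void main(String[] args) {
        System.out.println(sampleRow());
        System.out.println(familyMap(sampleRow(), "testColFam"));
    }
}
```

Either shape contains every column that is present in the Result, so
switching between the two calls would not be expected to change how many
columns come back.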

On Mon, Aug 29, 2011 at 6:00 PM, Neerja Bhatnagar <ne...@gmail.com> wrote:
> Hi J-D,
>
> Thanks! I do scan 'tablename' on the shell, and I can see all 3 columns in
> the 1 column family for a row. I haven't set any TTL on the table or result
> scanner.
> Any other suggestions would be very welcome. I was getting the same response
> with result.getFamilyMap() and I moved to result.getMap() thinking I was
> using the wrong api.
>
> Cheers, Neerja
>
> On Mon, Aug 29, 2011 at 5:23 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> (sending to user@ and bccing dev@ since this is a user question)
>>
>> That type of problem can be "fun" to debug, did you try with the shell
>> to query the data? Do you get a different result?
>>
>> BTW, any TTL set on that table?
>>
>> J-D
>>

Re: HBase Scan returns fewer columns after a few minutes of insertion

Posted by Neerja Bhatnagar <ne...@gmail.com>.
Hi J-D,

Thanks! I do scan 'tablename' on the shell, and I can see all 3 columns in
the 1 column family for a row. I haven't set any TTL on the table or result
scanner.
Any other suggestions would be very welcome. I was getting the same response
with result.getFamilyMap() and I moved to result.getMap() thinking I was
using the wrong api.

Cheers, Neerja

On Mon, Aug 29, 2011 at 5:23 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> (sending to user@ and bccing dev@ since this is a user question)
>
> That type of problem can be "fun" to debug, did you try with the shell
> to query the data? Do you get a different result?
>
> BTW, any TTL set on that table?
>
> J-D
>
> On Mon, Aug 29, 2011 at 5:09 PM, Neerja Bhatnagar <ne...@gmail.com>
> wrote:
> > Hi,
> >
> > I am sorry if this question has been resolved before. Thank you for your
> > help.
> >
> > I am seeing really strange behavior with HBase Scan.
> >
> > I insert 1 row into a table named test, 1 col family named testColFam,
> and 3
> > columns : foo (with value foo), bar (with value bar), and id (a unique
> id).
> >
> > I wait 5 minutes, and run the following code to retrieve the row ---
> >
> > HTablePool htablePool = new HTablePool(config, maxsize);
> >
> > // "test" is the table name
> > HTable table = (HTable) htablePool.getTable("test");
> >
> > Scan scan = new Scan();
> > scan.addFamily(Bytes.toBytes("testColFam"));
> > scan.setStartRow(Bytes.toBytes("")); // scan from the first row
> > scan.setBatch(batchSize);
> >
> > ResultScanner resScanner = table.getScanner(scan);
> > Iterator<Result> resultIterator = resScanner.iterator();
> >
> > Result result = resultIterator.next();
> >
> > result.getMap();
> >
> > the result.getMap() behaves differently based on time-elapsed. If I run
> this
> > code as soon as I have inserted the data, the 3 columns in the 1 row are
> > returned as expected.
> >
> > But after some time elapses, scan returns fewer columns per row each
> time.
> >
> > Can anyone please help me with this? Please let me know if you need more
> > information.
> >
> > Do I need to set the timerange or something to make sure that all columns
> > are returned?
> >
> > Cheers, Neerja
> >
>