You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@accumulo.apache.org by Keith Massey <ke...@digitalreasoning.com> on 2011/10/26 21:58:18 UTC

Scanning for rows using columnfamily only

In our system we have certain queries where we search for all values in 
a given column family. We have a large table that has several billion 
rows in it spread across 30 nodes. A large percentage of the rows 
contain the column family I am searching for (a little less than half I 
think), but a lot do not. I am trying to pull back only the data in this 
given column family in fairly small batches. So far I've tried using 
Scanner.fetchColumnFamily(), and it takes sometimes a minute or more for 
the first batch to come back. For example:

Scanner scanner = factory.getConnector().createScanner(tableName, new 
Authorizations());
scanner.setBatchSize(100);
scanner.fetchColumnFamily(new Text(columnFamily));
Iterator<Map.Entry<Key, Value>> iterator = scanner.iterator();
while (iterator.hasNext()) { ... // takes 1 minute for this first call 
to hasNext() to return

Similarly I've tried "scan -c MyColumnFamily" from the shell, and it 
takes a minute or so to fetch the first batch. Is there a better way to 
do this? Or if I want to be able do something like this should I create 
a lot of separate tables rather than one giant table? Scans of whole 
tables actually seem to return the first batch really quickly.
Thanks in advance for any help.

Keith

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey
<ke...@digitalreasoning.com> wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
>
> Keith

One experiment worth trying may be to put multiple column families in
a single locality group.  For example, how does putting two column
families in each locality group affect ingest performance and scan
performance.  Then try 4,8, etc.

I may try to run some experiments w/ this also.
> On 10/26/11 6:48 PM, Keith Turner wrote:
>>
>> A few things to consider w/ these options.
>>
>> On Wed, Oct 26, 2011 at 4:13 PM, Adam Fuchs<ad...@ugov.gov>  wrote:
>>>
>>> Hi Keith,
>>> Sounds like you could use some locality groups! By default, Accumulo
>>> stores
>>
>> Consider the number of locality groups.  For example if you created
>> 100 locality groups, then reading all of them is like reading from 100
>> separate sections of a file at the same time and merging.  This could
>> cause a lot of seeking.  You would read all of them when you do not
>> fetch columns on the scanner.  Having a lot of locality groups may not
>> be a problem, if you always fetch columns.  I have not tested w/ a
>> large number of locality groups.
>>
>>> Another trick you can try is using a BatchScanner instead of a Scanner to
>>> read from multiple nodes in parallel. The tradeoff here is you get better
>>> query latency, but your key/value pairs are likely to come back out of
>>> sorted order. This section of the user manual describes the BatchScanner:
>>>
>>> http://incubator.apache.org/accumulo/user_manual_1.3-incubating/Writing_Accumulo_Clients.html#SECTION00520000000000000000
>>>
>> Using batch scanner option will parallelize the filtering of data on
>> tablet server side.  So it may be faster, but a lot more work is being
>> done.  This may be be a good option if you do not use locality groups
>> and do not need to run lots of them concurrently.  Lots of concurrent
>> batch scanner could slow down query performance.  For example creating
>> 100 batch scanners w/ 20 threads each will attempt to start 2000
>> threads on the tablet servers to filter data.  Locality groups would
>> be better for lots of concurrent scans.  The accumulo shell use the
>> batch scanner to implement grep.  A use case for the batch scanner
>> that is better for concurrent batch scanners is doing lots of small
>> lookups.
>>
>> Keith
>

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

On Wed, Nov 2, 2011 at 4:28 PM, Keith Massey
<ke...@digitalreasoning.com> wrote:
> On 11/1/11 3:11 PM, Keith Massey wrote:
>>
>> Thanks for the tips. We tried using one locality group per column family
>> (I think there are 20-25). It has definitely sped up queries for all
>> data in a single column family. The first batch comes back in about 5
>> seconds rather than 120 seconds without the locality groups. Our data
>> load time doubled though from 7 hours to 14 hours. I don't have any
>> evidence at this point that it is related to the locality groups. But
>> there were very few differences between the 7-hour load and the 14-hour
>> load. Any thoughts about whether this could be a side effect of loading
>> data into 25 locality groups? Or am I looking in the wrong place?
>> Thanks again.
>>
>> Keith
>
> Actually I might have spoken too soon. While many queries now come back in
> around 5 seconds that previously took more than 100, some still take a
> really long time. Specifically they seem to be queries for two column
> families that only appear in about 50 rows total (across billions in the
> table). I've lumped these two metadata-type column families into a single
> locality group. I've confirmed that they are recognized as being in a
> locality group. But if I "scan -c
> <column_family_that_is_in_this_locality_group>" in cloudbase shell, it takes
> hundreds of seconds to return all < 50 rows. Was this a bad use of locality
> groups? Should we just put this metadata into its own table? Thanks again.
>
> Keith
>

One other thing to consider.  Since the in memory map is not
partitioned by locality group, everything in memory would need to be
scanned for this case.  You can look at the monitor page and see if
the table has entries in memory.  If so, you can flush the table from
the shell.  When the entries in memory goes to zero on the monitor
page, try the scan again.

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

On Wed, Nov 2, 2011 at 4:28 PM, Keith Massey
<ke...@digitalreasoning.com> wrote:
> On 11/1/11 3:11 PM, Keith Massey wrote:
>>
>> Thanks for the tips. We tried using one locality group per column family
>> (I think there are 20-25). It has definitely sped up queries for all
>> data in a single column family. The first batch comes back in about 5
>> seconds rather than 120 seconds without the locality groups. Our data
>> load time doubled though from 7 hours to 14 hours. I don't have any
>> evidence at this point that it is related to the locality groups. But
>> there were very few differences between the 7-hour load and the 14-hour
>> load. Any thoughts about whether this could be a side effect of loading
>> data into 25 locality groups? Or am I looking in the wrong place?
>> Thanks again.
>>
>> Keith
>
> Actually I might have spoken too soon. While many queries now come back in
> around 5 seconds that previously took more than 100, some still take a
> really long time. Specifically they seem to be queries for two column
> families that only appear in about 50 rows total (across billions in the
> table). I've lumped these two metadata-type column families into a single
> locality group. I've confirmed that they are recognized as being in a
> locality group. But if I "scan -c
> <column_family_that_is_in_this_locality_group>" in cloudbase shell, it takes
> hundreds of seconds to return all < 50 rows. Was this a bad use of locality
> groups? Should we just put this metadata into its own table? Thanks again.
>
> Keith
>

If you are scanning the entire table, the scanner still needs to go to
each tablet.  On each tablet it may open files, look at file metadata,
and determine nothing is there.  The regular scanner will go through
the tablets sequentially.  The batch scanner would parallelize this.

Enabling the index cache for the table and adjusting the index cache
size may help the file metadata operations on each tablet go faster.
In 1.4 we enabled the index cache for all tables by default.

How many tablets do you have?

Keith

Re: Scanning for rows using columnfamily only

Posted by Keith Massey <ke...@digitalreasoning.com>.

On 11/1/11 3:11 PM, Keith Massey wrote:
> Thanks for the tips. We tried using one locality group per column family
> (I think there are 20-25). It has definitely sped up queries for all
> data in a single column family. The first batch comes back in about 5
> seconds rather than 120 seconds without the locality groups. Our data
> load time doubled though from 7 hours to 14 hours. I don't have any
> evidence at this point that it is related to the locality groups. But
> there were very few differences between the 7-hour load and the 14-hour
> load. Any thoughts about whether this could be a side effect of loading
> data into 25 locality groups? Or am I looking in the wrong place?
> Thanks again.
>
> Keith
Actually I might have spoken too soon. While many queries now come back 
in around 5 seconds that previously took more than 100, some still take 
a really long time. Specifically they seem to be queries for two column 
families that only appear in about 50 rows total (across billions in the 
table). I've lumped these two metadata-type column families into a 
single locality group. I've confirmed that they are recognized as being 
in a locality group. But if I "scan -c 
<column_family_that_is_in_this_locality_group>" in cloudbase shell, it 
takes hundreds of seconds to return all < 50 rows. Was this a bad use of 
locality groups? Should we just put this metadata into its own table? 
Thanks again.

Keith

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey
<ke...@digitalreasoning.com> wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
>
> Keith
>

I ran some experiments w/ different numbers of locality groups, it had
a noticeable effect on minor compactions times.  The results are in a
comment in ticket ACCUMULO-112. I suspect the locality group change is
behind the slowdown in ingest.

https://issues.apache.org/jira/browse/ACCUMULO-112

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey
<ke...@digitalreasoning.com> wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
>
> Keith
>

One issue could be that we do not segment the in memory map according
to locality groups.  This may not be the problem.  When we minor
compact, for each locality group we scan the entire in memory map and
write out the data for that locality group.  We have discussed
segmenting the in memory map per locality group. One drawback we
though of is that it would increase the insert cost in the case when a
mutation spans multiple locality groups.

Re: Scanning for rows using columnfamily only

Posted by Keith Massey <ke...@digitalreasoning.com>.

Thanks for the tips. We tried using one locality group per column family 
(I think there are 20-25). It has definitely sped up queries for all 
data in a single column family. The first batch comes back in about 5 
seconds rather than 120 seconds without the locality groups. Our data 
load time doubled though from 7 hours to 14 hours. I don't have any 
evidence at this point that it is related to the locality groups. But 
there were very few differences between the 7-hour load and the 14-hour 
load. Any thoughts about whether this could be a side effect of loading 
data into 25 locality groups? Or am I looking in the wrong place?
Thanks again.

Keith

On 10/26/11 6:48 PM, Keith Turner wrote:
> A few things to consider w/ these options.
>
> On Wed, Oct 26, 2011 at 4:13 PM, Adam Fuchs<ad...@ugov.gov>  wrote:
>> Hi Keith,
>> Sounds like you could use some locality groups! By default, Accumulo stores
> Consider the number of locality groups.  For example if you created
> 100 locality groups, then reading all of them is like reading from 100
> separate sections of a file at the same time and merging.  This could
> cause a lot of seeking.  You would read all of them when you do not
> fetch columns on the scanner.  Having a lot of locality groups may not
> be a problem, if you always fetch columns.  I have not tested w/ a
> large number of locality groups.
>
>> Another trick you can try is using a BatchScanner instead of a Scanner to
>> read from multiple nodes in parallel. The tradeoff here is you get better
>> query latency, but your key/value pairs are likely to come back out of
>> sorted order. This section of the user manual describes the BatchScanner:
>> http://incubator.apache.org/accumulo/user_manual_1.3-incubating/Writing_Accumulo_Clients.html#SECTION00520000000000000000
>>
> Using batch scanner option will parallelize the filtering of data on
> tablet server side.  So it may be faster, but a lot more work is being
> done.  This may be be a good option if you do not use locality groups
> and do not need to run lots of them concurrently.  Lots of concurrent
> batch scanner could slow down query performance.  For example creating
> 100 batch scanners w/ 20 threads each will attempt to start 2000
> threads on the tablet servers to filter data.  Locality groups would
> be better for lots of concurrent scans.  The accumulo shell use the
> batch scanner to implement grep.  A use case for the batch scanner
> that is better for concurrent batch scanners is doing lots of small
> lookups.
>
> Keith

Re: Scanning for rows using columnfamily only

Posted by Keith Turner <ke...@deenlo.com>.

A few things to consider w/ these options.

On Wed, Oct 26, 2011 at 4:13 PM, Adam Fuchs <ad...@ugov.gov> wrote:
> Hi Keith,
> Sounds like you could use some locality groups! By default, Accumulo stores

Consider the number of locality groups.  For example if you created
100 locality groups, then reading all of them is like reading from 100
separate sections of a file at the same time and merging.  This could
cause a lot of seeking.  You would read all of them when you do not
fetch columns on the scanner.  Having a lot of locality groups may not
be a problem, if you always fetch columns.  I have not tested w/ a
large number of locality groups.

> Another trick you can try is using a BatchScanner instead of a Scanner to
> read from multiple nodes in parallel. The tradeoff here is you get better
> query latency, but your key/value pairs are likely to come back out of
> sorted order. This section of the user manual describes the BatchScanner:
> http://incubator.apache.org/accumulo/user_manual_1.3-incubating/Writing_Accumulo_Clients.html#SECTION00520000000000000000
>

Using batch scanner option will parallelize the filtering of data on
tablet server side.  So it may be faster, but a lot more work is being
done.  This may be be a good option if you do not use locality groups
and do not need to run lots of them concurrently.  Lots of concurrent
batch scanner could slow down query performance.  For example creating
100 batch scanners w/ 20 threads each will attempt to start 2000
threads on the tablet servers to filter data.  Locality groups would
be better for lots of concurrent scans.  The accumulo shell use the
batch scanner to implement grep.  A use case for the batch scanner
that is better for concurrent batch scanners is doing lots of small
lookups.

Keith

Re: Scanning for rows using columnfamily only

Posted by Adam Fuchs <ad...@ugov.gov>.

Hi Keith,

Sounds like you could use some locality groups! By default, Accumulo stores
all of the column families in the same locality group, laying all of the
keys out sequentially. If you set up explicit locality groups, Accumulo will
partition the data stored in its RFiles. This lets you read from a subset of
your column families without ever reading locality groups that you don't
care about off of the disk. The Accumulo user manual has a pretty good
section on this:
http://incubator.apache.org/accumulo/user_manual_1.3-incubating/Table_Configuration.html#SECTION00610000000000000000

Another trick you can try is using a BatchScanner instead of a Scanner to
read from multiple nodes in parallel. The tradeoff here is you get better
query latency, but your key/value pairs are likely to come back out of
sorted order. This section of the user manual describes the BatchScanner:
http://incubator.apache.org/accumulo/user_manual_1.3-incubating/Writing_Accumulo_Clients.html#SECTION00520000000000000000

Cheers,
Adam

On Wed, Oct 26, 2011 at 3:58 PM, Keith Massey <
keith.massey@digitalreasoning.com> wrote:

> In our system we have certain queries where we search for all values in a
> given column family. We have a large table that has several billion rows in
> it spread across 30 nodes. A large percentage of the rows contain the column
> family I am searching for (a little less than half I think), but a lot do
> not. I am trying to pull back only the data in this given column family in
> fairly small batches. So far I've tried using Scanner.fetchColumnFamily(),
> and it takes sometimes a minute or more for the first batch to come back.
> For example:
>
> Scanner scanner = factory.getConnector().**createScanner(tableName, new
> Authorizations());
> scanner.setBatchSize(100);
> scanner.fetchColumnFamily(new Text(columnFamily));
> Iterator<Map.Entry<Key, Value>> iterator = scanner.iterator();
> while (iterator.hasNext()) { ... // takes 1 minute for this first call to
> hasNext() to return
>
> Similarly I've tried "scan -c MyColumnFamily" from the shell, and it takes
> a minute or so to fetch the first batch. Is there a better way to do this?
> Or if I want to be able do something like this should I create a lot of
> separate tables rather than one giant table? Scans of whole tables actually
> seem to return the first batch really quickly.
> Thanks in advance for any help.
>
> Keith
>
>