Posted to commits@accumulo.apache.org by kt...@apache.org on 2012/01/10 20:52:48 UTC
svn commit: r1229703 - in /incubator/accumulo/trunk: ./
docs/examples/README.bloom src/core/ src/server/
Author: kturner
Date: Tue Jan 10 19:52:47 2012
New Revision: 1229703
URL: http://svn.apache.org/viewvc?rev=1229703&view=rev
Log:
ACCUMULO-251 added commands to create multiple file bloom table, added info about purpose of flush, showed how to get info about files in a table to README.bloom (merged from 1.4)
Modified:
incubator/accumulo/trunk/ (props changed)
incubator/accumulo/trunk/docs/examples/README.bloom
incubator/accumulo/trunk/src/core/ (props changed)
incubator/accumulo/trunk/src/server/ (props changed)
Propchange: incubator/accumulo/trunk/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
/incubator/accumulo/branches/1.3:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215,1227231,1227611,1228195
/incubator/accumulo/branches/1.3.5rc:1209938
-/incubator/accumulo/branches/1.4:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699
Modified: incubator/accumulo/trunk/docs/examples/README.bloom
URL: http://svn.apache.org/viewvc/incubator/accumulo/trunk/docs/examples/README.bloom?rev=1229703&r1=1229702&r2=1229703&view=diff
==============================================================================
--- incubator/accumulo/trunk/docs/examples/README.bloom (original)
+++ incubator/accumulo/trunk/docs/examples/README.bloom Tue Jan 10 19:52:47 2012
@@ -93,8 +93,55 @@ prevent the files from being compacted i
* Flush the table using the shell
After following the above steps, each table will have a tablet with three map
-files. Each map file will contain 1 million entries generated with a different
-seed.
+files. Flushing the table after each batch of inserts will create a map file.
+Each map file will contain 1 million entries generated with a different seed.
+This assumes that Accumulo is configured with enough memory to hold 1 million
+inserts. If it is not, more map files will be created.
+
+The commands for creating the first table without bloom filters are below.
+
+ $ ./bin/accumulo shell -u username -p password
+ Shell - Accumulo Interactive Shell
+ - version: 1.4.x-incubating
+ - instance name: instance
+ - instance id: 00000000-0000-0000-0000-000000000000
+ -
+ - type 'help' for a list of available commands
+ -
+ username@instance> setauths -u username -s exampleVis
+ username@instance> createtable bloom_test1
+ username@instance bloom_test1> config -t bloom_test1 -s table.compaction.major.ratio=7
+ username@instance bloom_test1> exit
+
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 7 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 8 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 9 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+
+The commands for creating the second table with bloom filters are below.
+
+ $ ./bin/accumulo shell -u username -p password
+ Shell - Accumulo Interactive Shell
+ - version: 1.4.x-incubating
+ - instance name: instance
+ - instance id: 00000000-0000-0000-0000-000000000000
+ -
+ - type 'help' for a list of available commands
+ -
+ username@instance> setauths -u username -s exampleVis
+ username@instance> createtable bloom_test2
+ username@instance bloom_test2> config -t bloom_test2 -s table.compaction.major.ratio=7
+ username@instance bloom_test2> config -t bloom_test2 -s table.bloom.enabled=true
+ username@instance bloom_test2> exit
+
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 7 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 8 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
+ $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 9 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+ $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
Below 500 lookups are done against the table without bloom filters using random
NG seed 7. Even though only one map file will likely contain entries for this
@@ -119,3 +166,60 @@ map files existed.
Generating 500 random queries...finished
101.15 lookups/sec 4.94 secs
num results : 500
+
+You can verify the table has three files by looking in HDFS. HDFS paths use
+the table ID instead of the table name, so you will need the table ID first.
+The following command will show table IDs.
+
+ $ ./bin/accumulo shell -u username -p password
+ Shell - Accumulo Interactive Shell
+ - version: 1.4.x-incubating
+ - instance name: instance
+ - instance id: 00000000-0000-0000-0000-000000000000
+ -
+ - type 'help' for a list of available commands
+ -
+ username@instance> tables -l
+ !METADATA => !0
+ bloom_test1 => o7
+ bloom_test2 => o8
+ trace => 1
+ username@instance> quit
+
+So the table id for bloom_test2 is o8. The command below shows what files this
+table has in HDFS. This assumes Accumulo is at the default location in HDFS.
+
+ $ hadoop fs -lsr /accumulo/tables/o8
+ drwxr-xr-x - username supergroup 0 2012-01-10 14:02 /accumulo/tables/o8/default_tablet
+ -rw-r--r-- 3 username supergroup 52672650 2012-01-10 14:01 /accumulo/tables/o8/default_tablet/F00000dj.rf
+ -rw-r--r-- 3 username supergroup 52436176 2012-01-10 14:01 /accumulo/tables/o8/default_tablet/F00000dk.rf
+ -rw-r--r-- 3 username supergroup 52850173 2012-01-10 14:02 /accumulo/tables/o8/default_tablet/F00000dl.rf
+
+Running the PrintInfo command shows that one of the files has a bloom filter
+and that it is about 1.5MB.
+
+ $ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/o8/default_tablet/F00000dj.rf
+ Locality group : <DEFAULT>
+ Start block : 0
+ Num blocks : 752
+ Index level 0 : 43,598 bytes 1 blocks
+ First key : row_0000001169 foo:1 [exampleVis] 1326222052539 false
+ Last key : row_0999999421 foo:1 [exampleVis] 1326222052058 false
+ Num entries : 999,536
+ Column families : [foo]
+
+ Meta block : BCFile.index
+ Raw size : 4 bytes
+ Compressed size : 12 bytes
+ Compression type : gz
+
+ Meta block : RFile.index
+ Raw size : 43,696 bytes
+ Compressed size : 15,592 bytes
+ Compression type : gz
+
+ Meta block : acu_bloom
+ Raw size : 1,540,292 bytes
+ Compressed size : 1,433,115 bytes
+ Compression type : gz
+
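The "1.5MB" figure in the README text can be checked against the PrintInfo output with a little arithmetic. The sketch below is not part of the commit; the function name bloom_stats is hypothetical, and it assumes the acu_bloom raw size is dominated by the filter itself, so the bits-per-entry figure is only approximate.

```shell
#!/bin/sh
# bloom_stats is a hypothetical helper, not an Accumulo command.
# $1 = raw size of the acu_bloom meta block in bytes (from PrintInfo)
# $2 = number of entries in the file (from PrintInfo)
bloom_stats() {
    awk -v bytes="$1" -v entries="$2" 'BEGIN {
        # Decimal megabytes, and filter bits spent per stored entry.
        printf "%.2f MB raw, %.2f bits per entry\n", bytes / 1000000, bytes * 8 / entries
    }'
}

# Values copied from the PrintInfo output for F00000dj.rf above.
bloom_stats 1540292 999536
```

With the values shown for F00000dj.rf this works out to roughly 1.54 MB and a little over 12 bits per entry. The same PrintInfo output shows the block compressing only from 1,540,292 to 1,433,115 bytes under gz.
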
Propchange: incubator/accumulo/trunk/src/core/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
/incubator/accumulo/branches/1.3/src/core:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215
/incubator/accumulo/branches/1.3.5rc/src/core:1209938
-/incubator/accumulo/branches/1.4/src/core:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4/src/core:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699
Propchange: incubator/accumulo/trunk/src/server/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
/incubator/accumulo/branches/1.3/src/server:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215,1227231,1227611
/incubator/accumulo/branches/1.3.5rc/src/server:1209938
-/incubator/accumulo/branches/1.4/src/server:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4/src/server:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699
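
The README.bloom changes in this commit repeat the same ingest and flush pair three times per table, varying only the seed (7, 8, 9). A minimal sketch of that pattern as a loop follows. It is a dry run that only prints the commands; the function name print_bloom_ingest is hypothetical, and the instance, zookeepers, username, and password values are the same placeholders the README uses.

```shell
#!/bin/sh
# print_bloom_ingest is a hypothetical helper, not part of Accumulo.
# It prints (does not execute) the ingest and flush commands from
# README.bloom for the table named in $1, one pair per seed.
print_bloom_ingest() {
    table="$1"
    for seed in 7 8 9; do
        # Ingest 1 million random entries generated from this seed.
        echo "./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s $seed instance zookeepers username password $table 1000000 0 1000000000 50 2000000 60000 3 exampleVis"
        # Flush and wait (-w) so each batch lands in its own map file.
        echo "./bin/accumulo shell -u username -p password -e 'flush -t $table -w'"
    done
}

print_bloom_ingest bloom_test2
```

To run the commands rather than print them, the output could be reviewed and then piped to sh from the Accumulo home directory, after substituting real connection values.
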