Posted to commits@accumulo.apache.org by kt...@apache.org on 2012/01/10 20:52:48 UTC

svn commit: r1229703 - in /incubator/accumulo/trunk: ./ docs/examples/README.bloom src/core/ src/server/

Author: kturner
Date: Tue Jan 10 19:52:47 2012
New Revision: 1229703

URL: http://svn.apache.org/viewvc?rev=1229703&view=rev
Log:
ACCUMULO-251 added commands to create multiple file bloom table, added info about purpose of flush, showed how to get info about files in a table to README.bloom (merged from 1.4)

Modified:
    incubator/accumulo/trunk/   (props changed)
    incubator/accumulo/trunk/docs/examples/README.bloom
    incubator/accumulo/trunk/src/core/   (props changed)
    incubator/accumulo/trunk/src/server/   (props changed)

Propchange: incubator/accumulo/trunk/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
 /incubator/accumulo/branches/1.3:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215,1227231,1227611,1228195
 /incubator/accumulo/branches/1.3.5rc:1209938
-/incubator/accumulo/branches/1.4:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699

Modified: incubator/accumulo/trunk/docs/examples/README.bloom
URL: http://svn.apache.org/viewvc/incubator/accumulo/trunk/docs/examples/README.bloom?rev=1229703&r1=1229702&r2=1229703&view=diff
==============================================================================
--- incubator/accumulo/trunk/docs/examples/README.bloom (original)
+++ incubator/accumulo/trunk/docs/examples/README.bloom Tue Jan 10 19:52:47 2012
@@ -93,8 +93,55 @@ prevent the files from being compacted i
  * Flush the table using the shell
 
 After following the above steps, each table will have a tablet with three map
-files.  Each map file will contain 1 million entries generated with a different
-seed. 
+files.  Flushing the table after each batch of inserts will create a map file.
+Each map file will contain 1 million entries generated with a different seed.
+This is assuming that Accumulo is configured with enough memory to hold 1
+million inserts.  If not, then more map files will be created. 
+
+The commands for creating the first table without bloom filters are below.
+
+    $ ./bin/accumulo shell -u username -p password
+    Shell - Accumulo Interactive Shell
+    - version: 1.4.x-incubating
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    - 
+    - type 'help' for a list of available commands
+    - 
+    username@instance> setauths -u username -s exampleVis
+    username@instance> createtable bloom_test1
+    username@instance bloom_test1> config -t bloom_test1 -s table.compaction.major.ratio=7
+    username@instance bloom_test1> exit
+
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 7 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 8 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 9 instance zookeepers username password bloom_test1 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test1 -w'
+
+The commands for creating the second table with bloom filters are below.
+
+    $ ./bin/accumulo shell -u username -p password
+    Shell - Accumulo Interactive Shell
+    - version: 1.4.x-incubating
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    - 
+    - type 'help' for a list of available commands
+    - 
+    username@instance> setauths -u username -s exampleVis
+    username@instance> createtable bloom_test2
+    username@instance bloom_test2> config -t bloom_test2 -s table.compaction.major.ratio=7
+    username@instance bloom_test2> config -t bloom_test2 -s table.bloom.enabled=true
+    username@instance bloom_test2> exit
+
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 7 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 8 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
+    $ ./bin/accumulo org.apache.accumulo.examples.client.RandomBatchWriter -s 9 instance zookeepers username password bloom_test2 1000000 0 1000000000 50 2000000 60000 3 exampleVis
+    $ ./bin/accumulo shell -u username -p password -e 'flush -t bloom_test2 -w'
 
 Below 500 lookups are done against the table without bloom filters using random
 NG seed 7.  Even though only one map file will likely contain entries for this
@@ -119,3 +166,60 @@ map files existed.
     Generating 500 random queries...finished
     101.15 lookups/sec   4.94 secs
     num results : 500
+
+You can verify the table has three files by looking in HDFS.  To look in HDFS
+you will need the table ID, because HDFS paths use the table ID instead of the
+table name.  The following command will show table IDs.
+
+    $ ./bin/accumulo shell -u username -p password
+    Shell - Accumulo Interactive Shell
+    - version: 1.4.x-incubating
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    - 
+    - type 'help' for a list of available commands
+    - 
+    username@instance> tables -l
+    !METADATA       =>         !0
+    bloom_test1     =>         o7
+    bloom_test2     =>         o8
+    trace           =>          1
+    username@instance> quit
+
+So the table ID for bloom_test2 is o8.  The command below shows what files this
+table has in HDFS.  This assumes Accumulo is at the default location in HDFS. 
+
+    $ hadoop fs -lsr /accumulo/tables/o8
+    drwxr-xr-x   - username supergroup          0 2012-01-10 14:02 /accumulo/tables/o8/default_tablet
+    -rw-r--r--   3 username supergroup   52672650 2012-01-10 14:01 /accumulo/tables/o8/default_tablet/F00000dj.rf
+    -rw-r--r--   3 username supergroup   52436176 2012-01-10 14:01 /accumulo/tables/o8/default_tablet/F00000dk.rf
+    -rw-r--r--   3 username supergroup   52850173 2012-01-10 14:02 /accumulo/tables/o8/default_tablet/F00000dl.rf
+
+Running the PrintInfo command shows that one of the files has a bloom filter
+and that the filter is about 1.5MB.
+
+    $ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/o8/default_tablet/F00000dj.rf
+    Locality group         : <DEFAULT>
+	Start block          : 0
+	Num   blocks         : 752
+	Index level 0        : 43,598 bytes  1 blocks
+	First key            : row_0000001169 foo:1 [exampleVis] 1326222052539 false
+	Last key             : row_0999999421 foo:1 [exampleVis] 1326222052058 false
+	Num entries          : 999,536
+	Column families      : [foo]
+
+    Meta block     : BCFile.index
+      Raw size             : 4 bytes
+      Compressed size      : 12 bytes
+      Compression type     : gz
+
+    Meta block     : RFile.index
+      Raw size             : 43,696 bytes
+      Compressed size      : 15,592 bytes
+      Compression type     : gz
+
+    Meta block     : acu_bloom
+      Raw size             : 1,540,292 bytes
+      Compressed size      : 1,433,115 bytes
+      Compression type     : gz
+

Propchange: incubator/accumulo/trunk/src/core/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
 /incubator/accumulo/branches/1.3/src/core:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215
 /incubator/accumulo/branches/1.3.5rc/src/core:1209938
-/incubator/accumulo/branches/1.4/src/core:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4/src/core:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699

Propchange: incubator/accumulo/trunk/src/server/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Jan 10 19:52:47 2012
@@ -1,3 +1,3 @@
 /incubator/accumulo/branches/1.3/src/server:1190280,1190413,1190420,1190427,1190500,1195622,1195625,1195629,1195635,1196044,1196054,1196057,1196071-1196072,1196106,1197066,1198935,1199383,1203683,1204625,1205547,1205880,1206169,1208031,1209124,1209526,1209532,1209539,1209541,1209587,1209657,1210518,1210571,1210596,1210598,1213424,1214320,1225006,1227215,1227231,1227611
 /incubator/accumulo/branches/1.3.5rc/src/server:1209938
-/incubator/accumulo/branches/1.4/src/server:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651
+/incubator/accumulo/branches/1.4/src/server:1201902-1228245,1228308,1229205,1229220,1229248,1229357,1229424,1229427-1229428,1229588,1229651,1229699
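
The README.bloom change above lists the same ingest-and-flush pair three times per table, once for each seed (7, 8, 9). As a rough sketch only, that sequence could be wrapped in a loop. The names ACCUMULO_CMD, DRY_RUN, run and load_batches are hypothetical helpers introduced here for illustration and are not part of Accumulo; the connection values (instance, zookeepers, username, password) are the README's own placeholders.

```shell
#!/bin/sh
# Sketch only: ACCUMULO_CMD, DRY_RUN, run and load_batches are hypothetical
# names, not part of Accumulo. Connection values are README placeholders.
ACCUMULO_CMD="${ACCUMULO_CMD:-./bin/accumulo}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Write three batches (seeds 7, 8, 9) into the table named by $1 and flush
# after each batch, mirroring the command sequence shown in README.bloom.
load_batches() {
  table="$1"
  for seed in 7 8 9; do
    run "$ACCUMULO_CMD" org.apache.accumulo.examples.client.RandomBatchWriter \
      -s "$seed" instance zookeepers username password "$table" \
      1000000 0 1000000000 50 2000000 60000 3 exampleVis
    run "$ACCUMULO_CMD" shell -u username -p password -e "flush -t $table -w"
  done
}
```

With DRY_RUN set to 1, calling load_batches bloom_test1 prints the six commands instead of running them, which is a cheap way to check the arguments before loading a real table. The table itself still has to be created and configured in the shell first, as the README shows.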