Posted to commits@cassandra.apache.org by br...@apache.org on 2011/04/29 21:28:07 UTC

svn commit: r1097922 - in /cassandra/branches/cassandra-0.7/contrib/pig: README.txt example-script.pig

Author: brandonwilliams
Date: Fri Apr 29 19:28:06 2011
New Revision: 1097922

URL: http://svn.apache.org/viewvc?rev=1097922&view=rev
Log:
Update pig example script to work again.
Patch by Jeremy Hanna, reviewed by brandonwilliams for CASSANDRA-2487

Modified:
    cassandra/branches/cassandra-0.7/contrib/pig/README.txt
    cassandra/branches/cassandra-0.7/contrib/pig/example-script.pig

Modified: cassandra/branches/cassandra-0.7/contrib/pig/README.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/pig/README.txt?rev=1097922&r1=1097921&r2=1097922&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/contrib/pig/README.txt (original)
+++ cassandra/branches/cassandra-0.7/contrib/pig/README.txt Fri Apr 29 19:28:06 2011
@@ -18,17 +18,22 @@ also set PIG_CONF_DIR to the location of
 
 Finally, set the following as environment variables (uppercase,
 underscored), or as Hadoop configuration variables (lowercase, dotted):
-* PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on 
 * PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
+* PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on
 * PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
 
-Run:
+For example, against a local node with the default settings, you'd use:
+export PIG_INITIAL_ADDRESS=localhost
+export PIG_RPC_PORT=9160
+export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
+
+Then you can build and run it like this:
 
 contrib/pig$ ant
 contrib/pig$ bin/pig_cassandra -x local example-script.pig
 
 This will run the test script against your Cassandra instance
-and will assume that there is a Keyspace1/Standard1 with some
+and will assume that there is a MyKeyspace/MyColumnFamily with some
 data in it. It will run in local mode (see pig docs for more info).
 
 If you'd like to get to a 'grunt>' shell prompt, run:
@@ -38,24 +43,24 @@ contrib/pig$ bin/pig_cassandra -x local
 Once the 'grunt>' shell has loaded, try a simple program like the
 following, which will determine the top 50 column names:
 
-grunt> rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();
-grunt> cols = FOREACH rows GENERATE flatten($1);
+grunt> rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});
+grunt> cols = FOREACH rows GENERATE flatten(columns);
 grunt> colnames = FOREACH cols GENERATE $0;
-grunt> namegroups = GROUP colnames BY $0;
+grunt> namegroups = GROUP colnames BY (chararray) $0;
 grunt> namecounts = FOREACH namegroups GENERATE COUNT($1), group;
 grunt> orderednames = ORDER namecounts BY $0;
 grunt> topnames = LIMIT orderednames 50;
 grunt> dump topnames;
 
 Slices on columns can also be specified:
-grunt> rows = LOAD 'cassandra://Keyspace1/Standard1&slice_start=C2&slice_end=C4&i&limit=1&reversed=true' USING CassandraStorage();
+grunt> rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily&slice_start=C2&slice_end=C4&i&limit=1&reversed=true' USING CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});
 
 Binary values for slice_start and slice_end can be escaped such as '\u0255'
 
 Outputting to Cassandra requires the same format from input, so the simplest example is:
 
-grunt> rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();
-grunt> STORE rows into 'cassandra://Keyspace1/Standard2' USING CassandraStorage();
+grunt> rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage();
+grunt> STORE rows into 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage();
 
 Which will copy the ColumnFamily.  Note that the destination ColumnFamily must
 already exist for this to work.

Modified: cassandra/branches/cassandra-0.7/contrib/pig/example-script.pig
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/pig/example-script.pig?rev=1097922&r1=1097921&r2=1097922&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/contrib/pig/example-script.pig (original)
+++ cassandra/branches/cassandra-0.7/contrib/pig/example-script.pig Fri Apr 29 19:28:06 2011
@@ -1,7 +1,7 @@
-rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();
-cols = FOREACH rows GENERATE flatten($1);
+rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});
+cols = FOREACH rows GENERATE flatten(columns);
 colnames = FOREACH cols GENERATE $0;
-namegroups = GROUP colnames BY $0;
+namegroups = GROUP colnames BY (chararray) $0;
 namecounts = FOREACH namegroups GENERATE COUNT($1), group;
 orderednames = ORDER namecounts BY $0;
 topnames = LIMIT orderednames 50;
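
For readers who don't use Pig, the pipeline in example-script.pig can be sketched in plain Python. This is only an illustration and is not part of the commit: the function name top_column_names and the in-memory row layout (key, list of (name, value) pairs) are assumptions made here. The sketch sorts most-frequent-first to match the README's "top 50" wording. Pig's ORDER sorts ascending unless DESC is given, so the script's own ordering may differ.

```python
from collections import Counter


def top_column_names(rows, limit=50):
    """Count column names across rows and return the most frequent ones.

    `rows` is a hypothetical in-memory stand-in for the relation loaded by
    CassandraStorage: an iterable of (key, columns) pairs, where `columns`
    is a list of (name, value) tuples.
    """
    # FOREACH rows GENERATE flatten(columns), then keep only $0 (the name).
    # str() plays the role of the (chararray) cast used in the GROUP step.
    names = (str(name) for _key, columns in rows for name, _value in columns)

    # GROUP colnames BY name, then COUNT each group.
    counts = Counter(names)

    # ORDER by count (descending here, an assumption; ties broken by name)
    # and LIMIT to the first `limit` entries.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(count, name) for name, count in ordered[:limit]]


if __name__ == "__main__":
    sample = [
        ("row1", [("a", "1"), ("b", "2")]),
        ("row2", [("a", "3")]),
    ]
    for count, name in top_column_names(sample):
        print(count, name)
```

The (count, name) tuple order mirrors the script's GENERATE COUNT($1), group.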