You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Vincent BARAT (JIRA)" <ji...@apache.org> on 2011/07/28 10:42:09 UTC

[jira] [Created] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Problem with HBase loader 0.90.3 and PIG 0.8.1
----------------------------------------------

                 Key: PIG-2193
                 URL: https://issues.apache.org/jira/browse/PIG-2193
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.8.1
         Environment: HBase 0.90.3, Hadoop 0.20-append
            Reporter: Vincent BARAT


I've some data in HBase 0.90.3 and I run a simple script on them.

This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).

When data are loaded from HDFS instead of HBase, the script runs perfectly.

Here is the script loading from HDFS (works): 

start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

The same script loading from HBase (don't work):

start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

I guess it definitively means there is a nasty bug in the HBase loader.

Here is the PIG dump for the HBase version:

aws09:~# pig
2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
grunt> sessions = LIMIT sessions 100;
grunt> dump sessions;
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,

Input(s):
Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"

Output(s):
Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 1
Total records proactively spilled: 1944943

Job DAG:
job_201107251336_0314    ->    job_201107251336_0315,
job_201107251336_0315    ->    job_201107251336_0316,
job_201107251336_0316


2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
grunt>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-2193:
------------------------------

    Attachment: PIG-2193.patch

patch for trunk is attached.
  - This is the complete fix.
  - includes a unit test
  - earlier simple patch works, but does not set the project before the scan is initialized.
  - removed 'initialized' variable since we know when schema and projections are set:
    -- store schema is set inside checkSchema()
    -- projection is set in pushProjection()
    -- on the backend, projection is handled in side setLocation().
    -- on the backend store schema is set inside setStoreLocation().


> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Vincent BARAT (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072260#comment-13072260 ] 

Vincent BARAT commented on PIG-2193:
------------------------------------

Interesting:

I've reduced my request to:

start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
dump sessions;

If I dump 'sessions' before the FILTER statement, I correctly values for the 'infoid' and 'locid' field.

{code}
(00000A2A33254B8FAE1E9AEAB2428EBE,b87ac86bcf1d4cb44202aa826554a7b2,4e77d62e1839a470ec8386d42b85a076,1310649832970,00000A2A33254B8FAE1E9AEAB2428EBE,1310649838390,)
(00001DCECDC842C0A745C151B9EC295F,4a4bb0fff26e368c8209f1e480fdf70b,db3924d2e4b88bd103fa19aaa30a9af4,1310628836846,00001DCECDC842C0A745C151B9EC295F,1310628839075,)
(00001F8F2B3148D393963928188C72B6,5d7e58f68366b55d55862815f863a996,79e1ba90aa555e3e1041df4be657a11d,1310681918742,00001F8F2B3148D393963928188C72B6,1310681949182,)
{code}

But if I dump 'sessions' after the FILTER statement, I have no values for the 'infoid' and 'locid' fields.

{code}
(00000A2A33254B8FAE1E9AEAB2428EBE,,,1310649832970,00000A2A33254B8FAE1E9AEAB2428EBE,1310649838390,)
(00001DCECDC842C0A745C151B9EC295F,,,1310628836846,00001DCECDC842C0A745C151B9EC295F,1310628839075,)
(00001F8F2B3148D393963928188C72B6,,,1310681918742,00001F8F2B3148D393963928188C72B6,1310681949182,)
{code}

so the next JOIN return an empty table.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073024#comment-13073024 ] 

Raghu Angadi commented on PIG-2193:
-----------------------------------

This was introduced in PIG-1870.
TestHBaseStorage passes with the patch.. though I expected projection tests to fail.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Vincent BARAT (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072261#comment-13072261 ] 

Vincent BARAT commented on PIG-2193:
------------------------------------

Of course if I do the same test reading data from HDFS instead of HBase, I got the good values for the 'infoid' and 'locid' fields in both case.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (PIG-2193) Using HBaseStorage to scan 2 tables in the same Map job produces bad data

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2193:
-----------------------------------

    Summary: Using HBaseStorage to scan 2 tables in the same Map job produces bad data  (was: Problem with HBase loader 0.90.3 and PIG 0.8.1)

> Using HBaseStorage to scan 2 tables in the same Map job produces bad data
> -------------------------------------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073069#comment-13073069 ] 

Raghu Angadi commented on PIG-2193:
-----------------------------------

Our tests verify that we return the correct columns to users with projections.
How can we verify that HBase scanner actually fetches only the projected columns? It seems hard to get hold of the Scan object.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-2193:
------------------------------

    Attachment: PIG-2193.patch

The patch ignores the scanner conf in jobConf. This seems to fix my script.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072631#comment-13072631 ] 

Raghu Angadi commented on PIG-2193:
-----------------------------------

I am able to reproduced this with PIG-0.8.1. But in my case commenting out FILTER does not make a difference. I think it is because of the how HBase scanner is initialized on the backend. This script results in 2 mappers which are supposed to have two different scanners, but both are initialized from the same config. I will verify.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Vincent BARAT (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072763#comment-13072763 ] 

Vincent BARAT commented on PIG-2193:
------------------------------------

Do you have this patch adapted for PIG 0.8.1, apparently it does not apply directly

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072946#comment-13072946 ] 

Dmitriy V. Ryaboy commented on PIG-2193:
----------------------------------------

I have this nagging feeling that there was a reason I put the early termination in there (which is what appears to be causing the problem). Raghu, can you add a unit test for this?

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089202#comment-13089202 ] 

Dmitriy V. Ryaboy commented on PIG-2193:
----------------------------------------

+1 will commit to 0.9 and trunk

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2193) Using HBaseStorage to scan 2 tables in the same Map job produces bad data

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2193:
-----------------------------------

      Resolution: Fixed
    Release Note: Joining two HBase tables sometimes failed because mappers got confused about which of the tables they were supposed to scan.
          Status: Resolved  (was: Patch Available)

> Using HBaseStorage to scan 2 tables in the same Map job produces bad data
> -------------------------------------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-2193:
------------------------------

    Fix Version/s: 0.10
                   0.9.1
         Assignee: Raghu Angadi
           Status: Patch Available  (was: Open)

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072855#comment-13072855 ] 

Raghu Angadi commented on PIG-2193:
-----------------------------------

PIG-2193.patch I attached is for 0.8 branch. It applies fine. Did you try it on clean 0.8.1? If you want to try on your code, your just need to comment out 3 lines in HBaseStorage::setLocation().

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073808#comment-13073808 ] 

Dmitriy V. Ryaboy commented on PIG-2193:
----------------------------------------

I will review.

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Vincent BARAT (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072824#comment-13072824 ] 

Vincent BARAT commented on PIG-2193:
------------------------------------

We have tried PIG trunk patched with PIG-2174 without success: same behavior.
The patch does not apply on PIG 0.8.1, did you adapt it ?

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Vincent BARAT (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078677#comment-13078677 ] 

Vincent BARAT commented on PIG-2193:
------------------------------------

We have tested this patch on the PIG trunk with success.
Big thanks for your help guys!
I guess we will use PIG 0.9 + this patch in production. Or maybe wait for PIG 0.9.1 if this patch is to be included in.


> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>            Assignee: Raghu Angadi
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2193.patch, PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072434#comment-13072434 ] 

Bill Graham commented on PIG-2193:
----------------------------------

@Vincent, can you please try applying the patch in PIG-2174 and report back if it makes a difference?

> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.
> Here is the PIG dump for the HBase version:
> aws09:~# pig
> 2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
> 2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
> 2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
> grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
> grunt> sessions = LIMIT sessions 100;
> grunt> dump sessions;
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
> 2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
> 2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
> 2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
> 2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
> 2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
> 2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
> 2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
> 2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
> 2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
> 2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
> 2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
> 2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
> 2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
> 2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
> 2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
> 2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
> 2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
> 2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
> 2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
> 2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
> 2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
> 2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
> 2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
> 2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
> 2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
> 2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
> 2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
> 2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
> 2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
> 2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
> 2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
> 2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
> 2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
> 2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
> 2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
> 2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
> 2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
> 2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
> 2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
> 2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
> 2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
> 2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
> 2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
> 2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
> 2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
> 2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
> 0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT
> Success!
> Job Stats (time in seconds):
> JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
> job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
> job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
> job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
> Input(s):
> Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
> Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
> Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
> Output(s):
> Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 1
> Total records proactively spilled: 1944943
> Job DAG:
> job_201107251336_0314    ->    job_201107251336_0315,
> job_201107251336_0315    ->    job_201107251336_0316,
> job_201107251336_0316
> 2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> grunt>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (PIG-2193) Problem with HBase loader 0.90.3 and PIG 0.8.1

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2193:
-----------------------------------

    Description: 
I've some data in HBase 0.90.3 and I run a simple script on them.

This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).

When data are loaded from HDFS instead of HBase, the script runs perfectly.

Here is the script loading from HDFS (works): 

start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

The same script loading from HBase (don't work):

start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

I guess it definitively means there is a nasty bug in the HBase loader.



  was:
I've some data in HBase 0.90.3 and I run a simple script on them.

This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).

When data are loaded from HDFS instead of HBase, the script runs perfectly.

Here is the script loading from HDFS (works): 

start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

The same script loading from HBase (don't work):

start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

I guess it definitively means there is a nasty bug in the HBase loader.

Here is the PIG dump for the HBase version:

aws09:~# pig
2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
grunt> sessions = LIMIT sessions 100;
grunt> dump sessions;
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,

Input(s):
Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"

Output(s):
Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 1
Total records proactively spilled: 1944943

Job DAG:
job_201107251336_0314    ->    job_201107251336_0315,
job_201107251336_0315    ->    job_201107251336_0316,
job_201107251336_0316


2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
grunt>



Moving the trace from ticket description to a comment (long descriptions make the ticket hard to work with in jira).


Here is the PIG dump for the HBase version:

aws09:~# pig
2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging error messages to: /root/pig_1311841056328.log
2011-07-28 08:17:36,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
2011-07-28 08:17:36,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
grunt> sessions = LIMIT sessions 100;
grunt> dump sessions;
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER,LIMIT
2011-07-28 08:17:50,275 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-28 08:17:51,213 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: sessions: Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) - scope-93 Operator Key: scope-93)
2011-07-28 08:17:51,225 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2011-07-28 08:17:51,350 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2011-07-28 08:17:51,402 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:17:51,411 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=aws09.machine.com
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_22
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/root
2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:51,493 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated timeout = 60000
2011-07-28 08:17:51,575 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; hsa=aws03.machine.com:60020
2011-07-28 08:17:51,687 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,700 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,729 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:53,328 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-28 08:17:53,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:17:53,442 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:17:53,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-07-28 08:17:53,989 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated timeout = 60000
2011-07-28 08:17:54,000 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,005 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,006 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,011 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,017 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated timeout = 60000
2011-07-28 08:17:54,029 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,032 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,033 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,037 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is aws03.machine.com:60020
2011-07-28 08:17:54,039 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,067 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,068 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:54,109 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated timeout = 60000
2011-07-28 08:17:54,123 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,140 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,142 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,148 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,154 [Thread-13] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated timeout = 60000
2011-07-28 08:17:54,164 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; hsa=aws03.machine.com:60020
2011-07-28 08:17:54,167 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,169 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,172 [Thread-13] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is aws03.machine.com:60020
2011-07-28 08:17:54,173 [Thread-13] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:17:54,180 [Thread-13] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,180 [Thread-13] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0314
2011-07-28 08:17:55,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
2011-07-28 08:19:06,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
2011-07-28 08:19:15,971 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
2011-07-28 08:19:18,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 19% complete
2011-07-28 08:19:25,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 27% complete
2011-07-28 08:20:14,810 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:20:14,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:14,831 [main-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated timeout = 60000
2011-07-28 08:20:14,842 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; hsa=aws03.machine.com:60020
2011-07-28 08:20:14,847 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:14,852 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,311 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
2011-07-28 08:20:16,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-28 08:20:16,341 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:20:16,656 [Thread-32] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:20:16,693 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated timeout = 60000
2011-07-28 08:20:16,705 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,709 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,710 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,714 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,716 [Thread-32] INFO  org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=aws09.machine.com:2222 sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Socket connection established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO  org.apache.zookeeper.ClientCnxn - Session establishment complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated timeout = 60000
2011-07-28 08:20:16,723 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; hsa=aws03.machine.com:60020
2011-07-28 08:20:16,726 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,727 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,730 [Thread-32] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is aws03.machine.com:60020
2011-07-28 08:20:16,732 [Thread-32] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:20:16,772 [Thread-32] DEBUG org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0 -> aws03.machine.com:,
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:20:16,772 [Thread-32] INFO  org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0315
2011-07-28 08:20:17,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
2011-07-28 08:20:28,075 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 37% complete
2011-07-28 08:20:34,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 41% complete
2011-07-28 08:20:37,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2011-07-28 08:20:46,168 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 51% complete
2011-07-28 08:20:49,183 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 61% complete
2011-07-28 08:20:52,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 62% complete
2011-07-28 08:20:55,214 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
2011-07-28 08:21:01,244 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2011-07-28 08:21:07,311 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-28 08:21:07,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:21:08,770 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-28 08:21:08,778 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:08,910 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-28 08:21:08,911 [Thread-47] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107251336_0316
2011-07-28 08:21:09,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
2011-07-28 08:21:16,321 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2011-07-28 08:21:34,439 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-07-28 08:21:34,441 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
job_201107251336_0316    1    1    3    3    3    12    12    12            hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,

Input(s):
Successfully read 2069446 records from: "endSession.mde253811.preprod.ubithere.com"
Successfully read 2072419 records from: "startSession.mde253811.preprod.ubithere.com"
Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"

Output(s):
Successfully stored 0 records in: "hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 1
Total records proactively spilled: 1944943

Job DAG:
job_201107251336_0314    ->    job_201107251336_0315,
job_201107251336_0315    ->    job_201107251336_0316,
job_201107251336_0316


2011-07-28 08:21:34,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2011-07-28 08:21:34,500 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-28 08:21:34,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
grunt>


> Problem with HBase loader 0.90.3 and PIG 0.8.1
> ----------------------------------------------
>
>                 Key: PIG-2193
>                 URL: https://issues.apache.org/jira/browse/PIG-2193
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: HBase 0.90.3, Hadoop 0.20-append
>            Reporter: Vincent BARAT
>         Attachments: PIG-2193.patch
>
>
> I've some data in HBase 0.90.3 and I run a simple script on them.
> This script badly returns 0 records. From time to time, under yet undefined conditions, the same script on the same data works (it return correct data).
> When data are loaded from HDFS instead of HBase, the script runs perfectly.
> Here is the script loading from HDFS (works): 
> start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> The same script loading from HBase (don't work):
> start_sessions = LOAD 'startSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
> end_sessions = LOAD 'endSession' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> infos = LOAD 'info' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType data:networkSubtype data:locale data:applicationVersionName data:carrierCountry data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion data:firmwareName') AS (infoid:chararray, network_type:chararray, network_subtype:chararray, locale:chararray, version_name:chararray, carrier_country:chararray, carrier_name:chararray, phone_manufacturer:chararray, phone_model:chararray, firmware_version:chararray, firmware_name:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = JOIN sessions BY infoid, infos BY infoid;
> sessions = LIMIT sessions 100;
> dump sessions;
> I guess it definitively means there is a nasty bug in the HBase loader.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira