Posted to user@pig.apache.org by Vincent Barat <vi...@gmail.com> on 2011/07/27 14:38:44 UTC

Blocking issue with HBase 0.90.3 and PIG 0.8.1

More info on this issue:

1- I use Pig 0.8.1, HBase 0.90.3, and Hadoop 0.20-append
2- The issue can be reproduced with the Pig trunk too

The script:

start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' 
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:infoid meta:imei meta:timestamp') AS (sid:chararray, 
infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' 
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:timestamp meta:locid') AS (sid:chararray, end:long, 
locid:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = FOREACH sessions GENERATE start_sessions::sid, imei, 
start, end;
sessions = LIMIT sessions 100;
dump sessions;
<output 1>
dump sessions;
<output 2>

The issue:

<output 1> is empty
<output 2> is 100 lines

I can reproduce the issue systematically.

Please advise: this issue prevents me from moving to HBase 0.90.3 in 
production, as I need to upgrade to Pig 0.8.1 at the same time!


Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vb...@ubikod.com>.
A clarification: the HBase classes from the Pig trunk cannot be compiled 
against Pig 0.8.1, so I was unable to test whether a fix was introduced 
in the latest version of these classes.
Point 2 above should therefore be disregarded.




Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vi...@gmail.com>.
Yes: if I remove the FILTER or the JOIN clause, the data loads fine 
and consistently.
I will do more testing, but yes, I suspect the HBase loader works 
incorrectly in my case...

The same query works perfectly with HBase 0.20.6 and Pig 0.6.1.


Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Thejas Nair <th...@hortonworks.com>.
I looked at the query plan for the query using explain, and it looks 
correct.
As you said, this is a simple use case; I would be very surprised if 
there were an optimizer bug here.
I suspect that something is wrong in loading the data from HBase. Are 
you able to get a simple load-store script working consistently?

Thanks,
Thejas
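
A minimal load-store check of this kind could look like the following 
(a sketch based on the table and columns from the original script; the 
output path is hypothetical):

```pig
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
-- No FILTER or JOIN in between: store exactly what the loader produced
STORE start_sessions INTO '/tmp/start_sessions_check';
```

If all four fields come out populated in the stored files, the loader 
works in isolation and the problem lies in what happens between LOAD 
and dump.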




Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vi...@gmail.com>.
I built the Pig trunk with the HBase 0.90.3 client lib (ant 
-Dhbase.version=0.90.3) and the issue is still there.

This makes me think of an issue in the optimizer... In any case, my 
query is not complex, so I wonder how such an issue could get through 
the Pig test suite!

Any help?


Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vi...@gmail.com>.
The behavior is not random.
The first dump is always empty, and the second always works.
I will try what you ask, and if I have more details, I will create a 
JIRA issue.

Thanks.
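
The store-and-reload sequence suggested here can be sketched as follows 
(the intermediate path is hypothetical):

```pig
-- Run 1: copy the HBase table contents to HDFS
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
STORE start_sessions INTO '/tmp/start_sessions_copy';

-- Run 2 (separate invocation): reload the copy with the default loader
-- and re-run the same JOIN/FILTER; if it now behaves, the HBase loader
-- is the prime suspect
start_sessions = LOAD '/tmp/start_sessions_copy'
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
```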

On 27/07/11 16:59, Raghu Angadi wrote:
> Vincent,
>
> is the behavior random or the same each time?
>
> A couple of things to narrow it down:
>    - attach the entire console output from the Pig run when this happened.
>    - only load start_sessions and end_sessions and store them.
>    - load the data from the tables stored in the previous step and run the
> same pig command
>
> Consider filing a JIRA. It might be a better place to go into more details.
>
> -Raghu.

Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vi...@gmail.com>.
I've reported the issue here: 
https://issues.apache.org/jira/browse/PIG-2193

Still investigating, but so far it seems that the FILTER clause makes 
the HBase loader lose all fields that are not explicitly used in 
the script.

I stripped the query down to:

start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com' 
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:infoid meta:imei meta:timestamp') AS (sid:chararray, 
infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' 
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:timestamp meta:locid') AS (sid:chararray, end:long, 
locid:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
dump sessions;

and in the result, the fields "infoid", "imei" and "locid" are 
empty, whereas the fields "sid", "start" and "end" are present.

(00000A2A33254B8FAE1E9AEAB2428EBE,,,1310649832970,00000A2A33254B8FAE1E9AEAB2428EBE,1310649838390,) 
(00001DCECDC842C0A745C151B9EC295F,,,1310628836846,00001DCECDC842C0A745C151B9EC295F,1310628839075,) 
(00001F8F2B3148D393963928188C72B6,,,1310681918742,00001F8F2B3148D393963928188C72B6,1310681949182,)
...

When using the HDFS loader, everything works correctly:

(00000A2A33254B8FAE1E9AEAB2428EBE,b87ac86bcf1d4cb44202aa826554a7b2,4e77d62e1839a470ec8386d42b85a076,1310649832970,00000A2A33254B8FAE1E9AEAB2428EBE,1310649838390,) 
(00001DCECDC842C0A745C151B9EC295F,4a4bb0fff26e368c8209f1e480fdf70b,db3924d2e4b88bd103fa19aaa30a9af4,1310628836846,00001DCECDC842C0A745C151B9EC295F,1310628839075,) 
(00001F8F2B3148D393963928188C72B6,5d7e58f68366b55d55862815f863a996,79e1ba90aa555e3e1041df4be657a11d,1310681918742,00001F8F2B3148D393963928188C72B6,1310681949182,) 

...
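
If the column-pruning optimization is what drops these fields, a 
possible workaround (an assumption on my side, not verified here) is to 
disable that single optimizer rule when launching Pig; Pig 0.8 lets 
individual rules be turned off with the -t option:

```shell
# Workaround sketch: disable the column-pruning rule for this run
# (script.pig is a placeholder name for the script above)
pig -t ColumnMapKeyPrune script.pig
```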

Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Vincent Barat <vi...@gmail.com>.
So, I've tried the exact same query but loading the data from HDFS 
files (using the regular Pig loader): it works!

Here is the query loading from HDFS:

start_sessions = LOAD 'start_sessions' AS (sid:chararray, 
infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long, 
locid:chararray);
infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray, 
network_subtype:chararray, locale:chararray, version_name:chararray, 
carrier_country:chararray, carrier_name:chararray, 
phone_manufacturer:chararray, phone_model:chararray, 
firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

The same query loading from HBase doesn't work:

start_sessions = LOAD 'startSession' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:infoid meta:imei meta:timestamp') AS (sid:chararray, 
infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:timestamp meta:locid') AS (sid:chararray, end:long, 
locid:chararray);
infos = LOAD 'info' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid 
data:networkType data:networkSubtype data:locale 
data:applicationVersionName data:carrierCountry data:carrierName 
data:phoneManufacturer data:phoneModel data:firmwareVersion 
data:firmwareName') AS (infoid:chararray, network_type:chararray, 
network_subtype:chararray, locale:chararray, version_name:chararray, 
carrier_country:chararray, carrier_name:chararray, 
phone_manufacturer:chararray, phone_model:chararray, 
firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;

I guess this definitely means there is a nasty bug in the HBase loader.
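
One way to check what projection the optimizer pushed into the loader 
is to EXPLAIN the final alias before dumping (standard Pig, nothing 
specific to this bug):

```pig
grunt> explain sessions;
```

A wrongly pruned column list should then be visible in the logical plan 
for each LOAD.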

Here is the Pig console output for the non-working query:

aws09:~# pig
2011-07-28 08:17:36,329 [main] INFO  org.apache.pig.Main - Logging 
error messages to: /root/pig_1311841056328.log
2011-07-28 08:17:36,641 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - 
Connecting to hadoop file system at: 
hdfs://aws09.preprod.ubithere.com:9000
2011-07-28 08:17:36,923 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - 
Connecting to map-reduce job tracker at: aws09.preprod.ubithere.com:9001
grunt> start_sessions = LOAD 
'startSession.mde253811.preprod.ubithere.com' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:infoid meta:imei meta:timestamp') AS (sid:chararray, 
infoid:chararray, imei:chararray, start:long);
grunt> end_sessions = LOAD 
'endSession.mde253811.preprod.ubithere.com' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid 
meta:timestamp meta:locid') AS (sid:chararray, end:long, 
locid:chararray);
grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid 
data:networkType data:networkSubtype data:locale 
data:applicationVersionName data:carrierCountry data:carrierName 
data:phoneManufacturer data:phoneModel data:firmwareVersion 
data:firmwareName') AS (infoid:chararray, network_type:chararray, 
network_subtype:chararray, locale:chararray, version_name:chararray, 
carrier_country:chararray, carrier_name:chararray, 
phone_manufacturer:chararray, phone_model:chararray, 
firmware_version:chararray, firmware_name:chararray);
grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
grunt> sessions = FILTER sessions BY end > start AND end - start < 
86400000L;
grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
grunt> sessions = LIMIT sessions 100;
grunt> dump sessions;
2011-07-28 08:17:50,275 [main] INFO  
org.apache.pig.tools.pigstats.ScriptState - Pig features used in the 
script: HASH_JOIN,FILTER,LIMIT
2011-07-28 08:17:50,275 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - 
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-28 08:17:51,213 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - 
(Name: sessions: 
Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage) 
- scope-93 Operator Key: scope-93)
2011-07-28 08:17:51,225 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler 
- File concatenation threshold: 100 optimistic? false
2011-07-28 08:17:51,281 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer 
- Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,281 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer 
- Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,350 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer 
- MR plan size before optimization: 3
2011-07-28 08:17:51,350 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer 
- MR plan size after optimization: 3
2011-07-28 08:17:51,402 [main] INFO  
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are 
added to the job
2011-07-28 08:17:51,411 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- mapred.job.reduce.markreset.buffer.percent is not set, set to 
default 0.3
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:zookeeper.version=3.3.2-1031432, built on 
11/05/2010 05:32 GMT
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:host.name=aws09.machine.com
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:java.version=1.6.0_22
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:java.vendor=Sun Microsystems Inc.
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client 
environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client 
environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:java.io.tmpdir=/tmp
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:java.compiler=<NA>
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:os.name=Linux
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:os.arch=amd64
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:user.name=root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:user.home=/root
2011-07-28 08:17:51,470 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Client environment:user.dir=/root
2011-07-28 08:17:51,471 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Initiating client connection, connectString=aws09.machine.com:2222 
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:51,493 [main-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] 
INFO  org.apache.zookeeper.ClientCnxn - Socket connection 
established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] 
INFO  org.apache.zookeeper.ClientCnxn - Session establishment 
complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada6054b, negotiated timeout = 60000
2011-07-28 08:17:51,575 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce; 
hsa=aws03.machine.com:60020
2011-07-28 08:17:51,687 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:51,696 [main] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:51,700 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 
is aws03.machine.com:60020
2011-07-28 08:17:51,726 [main] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:51,729 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is 
aws03.machine.com:60020
2011-07-28 08:17:53,328 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Setting up single store job
2011-07-28 08:17:53,335 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-28 08:17:53,335 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Neither PARALLEL nor default parallelism is set for this job. 
Setting number of reducers to 1
2011-07-28 08:17:53,442 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:17:53,944 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 0% complete
2011-07-28 08:17:53,989 [Thread-13] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:53,991 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:53,996 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada6054c, negotiated timeout = 60000
2011-07-28 08:17:54,000 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3; 
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,005 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,006 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:54,011 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 
is aws03.machine.com:60020
2011-07-28 08:17:54,017 [Thread-13] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,018 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,025 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada6054d, negotiated timeout = 60000
2011-07-28 08:17:54,029 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321; 
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,032 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,033 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:54,037 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 
is aws03.machine.com:60020
2011-07-28 08:17:54,039 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=2147483647 rows
2011-07-28 08:17:54,067 [Thread-13] DEBUG 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: 
split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,068 [Thread-13] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 
splits.
2011-07-28 08:17:54,068 [Thread-13] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - 
Returning 1 splits.
2011-07-28 08:17:54,109 [Thread-13] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,111 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,119 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada6054e, negotiated timeout = 60000
2011-07-28 08:17:54,123 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967; 
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,140 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,142 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:54,148 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is 
aws03.machine.com:60020
2011-07-28 08:17:54,154 [Thread-13] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,159 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,161 
[Thread-13-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada6054f, negotiated timeout = 60000
2011-07-28 08:17:54,164 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3; 
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,167 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,169 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=10 rows
2011-07-28 08:17:54,172 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is 
aws03.machine.com:60020
2011-07-28 08:17:54,173 [Thread-13] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at 
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for 
max=2147483647 rows
2011-07-28 08:17:54,180 [Thread-13] DEBUG 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: 
split -> 0 -> aws03.machine.com:,
2011-07-28 08:17:54,180 [Thread-13] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 
splits.
2011-07-28 08:17:54,180 [Thread-13] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - 
Returning 1 splits.
2011-07-28 08:17:55,037 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- HadoopJobId: job_201107251336_0314
2011-07-28 08:17:55,037 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- More information at: 
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
2011-07-28 08:19:06,924 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 8% complete
2011-07-28 08:19:15,971 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 16% complete
2011-07-28 08:19:18,985 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 19% complete
2011-07-28 08:19:25,035 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 27% complete
2011-07-28 08:20:14,810 [main] INFO  
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are 
added to the job
2011-07-28 08:20:14,812 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- mapred.job.reduce.markreset.buffer.percent is not set, set to 
default 0.3
2011-07-28 08:20:14,830 [main] INFO  org.apache.zookeeper.ZooKeeper 
- Initiating client connection, connectString=aws09.machine.com:2222 
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:14,831 [main-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] 
INFO  org.apache.zookeeper.ClientCnxn - Socket connection 
established to aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] 
INFO  org.apache.zookeeper.ClientCnxn - Session establishment 
complete on server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada60556, negotiated timeout = 60000
2011-07-28 08:20:14,842 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d; 
hsa=aws03.machine.com:60020
2011-07-28 08:20:14,847 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:14,849 [main] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 
for max=10 rows
2011-07-28 08:20:14,852 [main] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is 
aws03.machine.com:60020
2011-07-28 08:20:16,311 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Setting up single store job
2011-07-28 08:20:16,324 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- BytesPerReducer=1000000000 maxReducers=999 
totalInputFileSize=198330658
2011-07-28 08:20:16,324 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Neither PARALLEL nor default parallelism is set for this job. 
Setting number of reducers to 1
2011-07-28 08:20:16,341 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:20:16,656 [Thread-32] INFO  
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input 
paths to process : 1
2011-07-28 08:20:16,656 [Thread-32] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - 
Total input paths to process : 1
2011-07-28 08:20:16,693 [Thread-32] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,695 
[Thread-32-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,702 
[Thread-32-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada60557, negotiated timeout = 60000
2011-07-28 08:20:16,705 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252; 
hsa=aws03.machine.com:60020
2011-07-28 08:20:16,709 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,710 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 
for max=10 rows
2011-07-28 08:20:16,714 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is 
aws03.machine.com:60020
2011-07-28 08:20:16,716 [Thread-32] INFO  
org.apache.zookeeper.ZooKeeper - Initiating client connection, 
connectString=aws09.machine.com:2222 sessionTimeout=60000 
watcher=hconnection
2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to 
server aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,718 
[Thread-32-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established to 
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,720 
[Thread-32-SendThread(aws09.machine.com:2222)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on 
server aws09.machine.com/10.83.1.244:2222, sessionid = 
0x131617dada60558, negotiated timeout = 60000
2011-07-28 08:20:16,723 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Lookedup root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252; 
hsa=aws03.machine.com:60020
2011-07-28 08:20:16,726 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,727 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 
for max=10 rows
2011-07-28 08:20:16,730 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation 
- Cached location for 
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is 
aws03.machine.com:60020
2011-07-28 08:20:16,732 [Thread-32] DEBUG 
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. 
starting at row=info.mde253811.preprod.ubithere.com,,00000000000000 
for max=2147483647 rows
2011-07-28 08:20:16,772 [Thread-32] DEBUG 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: 
split -> 0 -> aws03.machine.com:,
2011-07-28 08:20:16,772 [Thread-32] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 
splits.
2011-07-28 08:20:16,772 [Thread-32] INFO  
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - 
Returning 1 splits.
2011-07-28 08:20:17,500 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- HadoopJobId: job_201107251336_0315
2011-07-28 08:20:17,500 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- More information at: 
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
2011-07-28 08:20:28,075 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 37% complete
2011-07-28 08:20:34,106 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 41% complete
2011-07-28 08:20:37,124 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 50% complete
2011-07-28 08:20:46,168 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 51% complete
2011-07-28 08:20:49,183 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 61% complete
2011-07-28 08:20:52,198 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 62% complete
2011-07-28 08:20:55,214 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 64% complete
2011-07-28 08:21:01,244 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 66% complete
2011-07-28 08:21:07,311 [main] INFO  
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are 
added to the job
2011-07-28 08:21:07,312 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- mapred.job.reduce.markreset.buffer.percent is not set, set to 
default 0.3
2011-07-28 08:21:08,770 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Setting up single store job
2011-07-28 08:21:08,778 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:21:08,910 [Thread-47] INFO  
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input 
paths to process : 1
2011-07-28 08:21:08,910 [Thread-47] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - 
Total input paths to process : 1
2011-07-28 08:21:08,911 [Thread-47] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - 
Total input paths (combined) to process : 1
2011-07-28 08:21:09,280 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- HadoopJobId: job_201107251336_0316
2011-07-28 08:21:09,280 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- More information at: 
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
2011-07-28 08:21:16,321 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 83% complete
2011-07-28 08:21:34,439 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 100% complete
2011-07-28 08:21:34,441 [main] INFO  
org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
0.20-append    0.8.1-SNAPSHOT    root    2011-07-28 08:17:51    2011-07-28 08:21:34    HASH_JOIN,FILTER,LIMIT

Success!

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTIme    AvgMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    Alias    Feature    Outputs
job_201107251336_0314    2    1    75    66    70    63    63    63    end_sessions,sessions,start_sessions    HASH_JOIN
job_201107251336_0315    4    1    15    6    12    24    24    24    infos,sessions    HASH_JOIN
job_201107251336_0316    1    1    3    3    3    12    12    12        hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,

Input(s):
Successfully read 2069446 records from: 
"endSession.mde253811.preprod.ubithere.com"
Successfully read 2072419 records from: 
"startSession.mde253811.preprod.ubithere.com"
Successfully read 19441 records from: 
"info.mde253811.preprod.ubithere.com"

Output(s):
Successfully stored 0 records in: 
"hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 1
Total records proactively spilled: 1944943

Job DAG:
job_201107251336_0314    ->    job_201107251336_0315,
job_201107251336_0315    ->    job_201107251336_0316,
job_201107251336_0316


2011-07-28 08:21:34,472 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- Success!
2011-07-28 08:21:34,500 [main] INFO  
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input 
paths to process : 1
2011-07-28 08:21:34,501 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - 
Total input paths to process : 1
grunt>




On 27/07/11 16:59, Raghu Angadi wrote:
> Vincent,
>
> is the behavior random or the same each time?
>
> A couple of things to narrow it down:
>    - attach the entire console output from the PIG run when this happened.
>    - only load start_sessions and end_sessions and store them.
>    - load the data from the tables stored in the previous step and run the same
> pig command.
>
> Consider filing a JIRA; it might be a better place to go into more details.
>
> -Raghu.
>
> On Wed, Jul 27, 2011 at 5:38 AM, Vincent Barat <vi...@gmail.com> wrote:
>
>> More info on this issue:
>>
>> 1- I use PIG 0.8.1 and HBase 0.90.3 and Hadoop 0.20-append
>> 2- The issue can be reproduced with PIG trunk too
>>
>> The script:
>>
>> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
>> USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid
>> meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray,
>> imei:chararray, start:long);
>> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
>> USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid
>> meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
>> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
>> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
>> sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
>> sessions = LIMIT sessions 100;
>> dump sessions;
>> <output 1>
>> dump sessions;
>> <output 2>
>>
>> The issue:
>>
>> <output 1>  is empty
>> <output 2>  is 100 lines
>>
>> I can reproduce the issue systematically.
>>
>> Please advise: this issue prevents me from moving to HBase 0.90.3 in
>> production, as I need to upgrade to PIG 0.8.1 at the same time!
>>
>>


Re: Blocking issue with HBase 0.90.3 and PIG 0.8.1

Posted by Raghu Angadi <an...@gmail.com>.
Vincent,

is the behavior random or the same each time?

A couple of things to narrow it down:
  - attach the entire console output from the PIG run when this happened.
  - only load start_sessions and end_sessions and store them.
  - load the data from the tables stored in the previous step and run the same
pig command.

Consider filing a JIRA; it might be a better place to go into more details.
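
A minimal sketch of those narrowing-down steps in Pig Latin, reusing the table names and schemas from the original script. The intermediate HDFS paths ('/tmp/start_sessions_dump', '/tmp/end_sessions_dump') are hypothetical, and default PigStorage is assumed for the intermediate files:

```pig
-- Step 1: load both tables from HBase and store them unchanged,
-- to see exactly what HBaseStorage returns.
start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp meta:locid')
    AS (sid:chararray, end:long, locid:chararray);
STORE start_sessions INTO '/tmp/start_sessions_dump';  -- hypothetical path
STORE end_sessions INTO '/tmp/end_sessions_dump';      -- hypothetical path

-- Step 2 (separate run): reload the stored copies and run the same
-- pipeline, taking HBase out of the picture entirely.
start_sessions = LOAD '/tmp/start_sessions_dump'
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);
end_sessions = LOAD '/tmp/end_sessions_dump'
    AS (sid:chararray, end:long, locid:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
sessions = LIMIT sessions 100;
DUMP sessions;
```

If the reloaded copies join correctly on both dumps, that points at the HBaseStorage side rather than at the join itself.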

-Raghu.

On Wed, Jul 27, 2011 at 5:38 AM, Vincent Barat <vi...@gmail.com> wrote:

> More info on this issue:
>
> 1- I use PIG 0.8.1 and HBase 0.90.3 and Hadoop 0.20-append
> 2- The issue can be reproduced with PIG trunk too
>
> The script:
>
> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
> USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid
> meta:infoid meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray,
> imei:chararray, start:long);
> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com'
> USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid
> meta:timestamp meta:locid') AS (sid:chararray, end:long, locid:chararray);
> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
> sessions = FOREACH sessions GENERATE start_sessions::sid, imei, start, end;
> sessions = LIMIT sessions 100;
> dump sessions;
> <output 1>
> dump sessions;
> <output 2>
>
> The issue:
>
> <output 1> is empty
> <output 2> is 100 lines
>
> I can reproduce the issue systematically.
>
> Please advise: this issue prevents me from moving to HBase 0.90.3 in
> production, as I need to upgrade to PIG 0.8.1 at the same time!
>
>