Posted to user@pig.apache.org by Pr...@itcinfotech.com on 2014/07/13 16:42:23 UTC

Support for Hadoop 2.2: Exception while loading xml file using XMLLoader

Hi Team,

I am getting the attached exception while dumping jobs_data from the following Pig script:

jobs_data = load '/user/hadoop/input/job_sample2.xml' using org.apache.pig.piggybank.storage.XMLLoader('Job') as (doc:chararray);

Loading files in other formats with USING PigStorage('') works fine.

Thanks in advance for your help.

Regards,
Pravin

Please consider the environment before printing this e-mail

Disclaimer: This  communication  is  for the exclusive use of the intended recipient(s) and  shall  not attach any liability on the originator or ITC Infotech India Ltd./its  Holding company/ its Subsidiaries/ its Group Companies. If you are the addressee, the contents of this e-mail are intended for your use only and it shall  not be forwarded to any third party, without first obtaining written authorization from the originator or ITC Infotech India Ltd./ its Holding company/its  Subsidiaries/ its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with  by any  third  party  in  any manner whatsoever without the specific consent  of  ITC  Infotech India Ltd./ its Holding company/ its Subsidiaries/ its Group Companies.

Re: Support for Hadoop 2.2: Exception while loading xml file using XMLLoader

Posted by Suraj Nayak M <sn...@gmail.com>.
Sandeep/Pravin,

Which versions of Pig and Hadoop are you using?

The error seems to be related to the YARN backend: the application with id
application_1405251610863_0007 doesn't exist in the RM. I tried to
reproduce it on my laptop, where I am running CDH 4.6 locally (in
pseudo-distributed mode, with YARN), and did not face any problem.

Pig code :
REGISTER /usr/lib/pig/pig-0.11.0-cdh4.6.0/contrib/piggybank/java/piggybank.jar

jobs_data = load 'sample.xml' using org.apache.pig.piggybank.storage.XMLLoader('note') as (doc:chararray);

dump jobs_data;

sample.xml :
<note>
     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>


Output :
(<note>
     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>)
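
For readers who have not used XMLLoader, here is a rough Python sketch of what the loader does in the example above: it returns each <note>...</note> element as a single chararray record. This is only an illustration, not the piggybank implementation. The helper name extract_elements and the regex approach are assumptions, and the sketch does not handle nested elements that share the same tag name.

```python
import re

# Sample modelled on the sample.xml shown above.
sample_xml = """<note>
     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>
"""


def extract_elements(text, tag):
    """Return every <tag>...</tag> block in text as one string per element."""
    # Non-greedy match from the opening tag (with optional attributes)
    # to the first matching closing tag; DOTALL lets '.' span newlines.
    pattern = re.compile(
        r"<{0}(?:\s[^>]*)?>.*?</{0}>".format(re.escape(tag)),
        re.DOTALL,
    )
    return pattern.findall(text)


# One record per <note> element, printed as a one-field tuple like Pig's dump.
records = extract_elements(sample_xml, "note")
for record in records:
    print((record,))
```

If the tag passed in does not occur in the file (tag names are case-sensitive here), the result is simply an empty list.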

--
Suraj Nayak



Re: Support for Hadoop 2.2: Exception while loading xml file using XMLLoader

Posted by Sa...@itcinfotech.com.
Hi Suraj,

Below is the stack trace.


Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias jobs_data. Backend error : org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1405251610863_0007' doesn't exist in RM.
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:251)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias jobs_data. Backend error : org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1405251610863_0007' doesn't exist in RM.
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:251)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)

        at org.apache.pig.PigServer.openIterator(PigServer.java:870)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
        at org.apache.pig.Main.run(Main.java:541)
        at org.apache.pig.Main.main(Main.java:156)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1405251610863_0007' doesn't exist in RM.
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:251)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)

        at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:348)
        at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
        at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:524)
        at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:578)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:578)
        at org.apache.hadoop.mapred.JobClient.getTaskReports(JobClient.java:633)
        at org.apache.hadoop.mapred.JobClient.getMapTaskReports(JobClient.java:627)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:150)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:428)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1322)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1307)
        at org.apache.pig.PigServer.storeEx(PigServer.java:978)
        at org.apache.pig.PigServer.store(PigServer.java:942)
        at org.apache.pig.PigServer.openIterator(PigServer.java:855)
        ... 12 more
================================================================================
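
A side note on the application id in the trace above: YARN ids have the form application_<clusterTimestamp>_<sequence>, where the cluster timestamp is the ResourceManager start time in epoch milliseconds. The small Python sketch below decodes it, in case it helps check whether the RM was restarted between job submission and the status lookup. That is only one thing worth ruling out, not a confirmed cause in this thread, and the variable names are illustrative.

```python
import re
from datetime import datetime, timezone

# Error text copied from the stack trace above.
message = (
    "Application with id 'application_1405251610863_0007' "
    "doesn't exist in RM."
)

# Split the id into its cluster timestamp and per-RM sequence number.
match = re.search(r"application_(\d+)_(\d+)", message)
cluster_ts_ms = int(match.group(1))
sequence = int(match.group(2))

# The cluster timestamp is in epoch milliseconds; convert it to UTC.
rm_started = datetime.fromtimestamp(cluster_ts_ms / 1000, tz=timezone.utc)
print("RM started (UTC):", rm_started.isoformat(timespec="seconds"))
print("Application sequence number:", sequence)
```

For this id the timestamp falls on 13 July 2014 (UTC), the same day as the original post, and the sequence number is 7.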


Re: Support for Hadoop 2.2: Exception while loading xml file using XMLLoader

Posted by Suraj Nayak M <sn...@gmail.com>.
Sandeep,

There is no attachment in the email. Can you paste the error in the body
if you are unable to attach it?
