Posted to user@hive.apache.org by Manhee Jo <jo...@nttdocomo.com> on 2009/05/13 05:04:28 UTC

Execution Error: ExecDriver

Hello. Can anybody help me please?
When I was running


hive> from ( from pref select transform(pref.id, pref.pref) as (oid, opref)
    > using '/home/hadoop/work/test.pl' cluster by oid ) tmap
    > insert overwrite table pref_new select tmap.oid, tmap.opref;


I came up with an execution error saying


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.ExecDriver


The extended description of pref_new is


hive> describe extended pref_new;
OK
id int
pref string
Detailed Table Information:
Table(tableName:pref_new,dbName:default,owner:hadoop,createTime:1242036442,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:int,comment:null), 
FieldSchema(name:pref,type:string,comment:null)],location:/user/hive/warehouse/pref_new,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,parameters:{serialization.format=,,line.delim=
,field.delim=,}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[],parameters:{SORTBUCKETCOLSPREFIX=TRUE})


test.pl is


#!/usr/bin/perl

while (<>) {
    chop;
    my(@w) = split('\t', $_);
    $w[1] =~ s/abc$//;
    printf("%d\t%s\n", $w[0], $w[1]);
}
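For reference, the script reads tab-separated (id, pref) lines and strips a trailing "abc" from the second field. A minimal Python sketch of the same transform (an illustration, not part of the original script):

```python
import re
import sys


def transform(line: str) -> str:
    """Mirror test.pl: split the line on TAB, drop a trailing 'abc'
    from the second field, and re-emit as id<TAB>pref."""
    oid, pref = line.rstrip("\n").split("\t")
    pref = re.sub(r"abc$", "", pref)  # same as $w[1] =~ s/abc$//;
    return "%d\t%s\n" % (int(oid), pref)


if __name__ == "__main__":
    # Like the Perl version, act as a stdin-to-stdout streaming filter.
    for line in sys.stdin:
        sys.stdout.write(transform(line))
```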


Contents of /tmp/hadoop/hive.log is


2009-05-13 10:13:58,824 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref { i32 
id, string pref}
2009-05-13 10:13:58,953 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}
2009-05-13 10:13:59,065 WARN  exec.ExecDriver 
(ExecDriver.java:fillInDefaults(82)) - Number of reduce tasks not specified. 
Defaulting to jobconf value of: 1
2009-05-13 10:14:00,083 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(547)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
2009-05-13 10:14:21,118 ERROR exec.ExecDriver 
(SessionState.java:printError(242)) - Ended Job = job_200905130946_0002 with 
errors
2009-05-13 10:14:21,129 ERROR ql.Driver 
(SessionState.java:printError(242)) - FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.ExecDriver
2009-05-13 10:14:35,294 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}


And task details of a failed attempt from the web browser 
(http://...:50030/taskdetails.jsp? ...) is


java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:62)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.Child.main(Child.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot 
initialize ScriptOperator
 at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:130)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
 at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initialize(SelectOperator.java:48)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
 at 
org.apache.hadoop.hive.ql.exec.ForwardOperator.initialize(ForwardOperator.java:35)
 at 
org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:166)
 at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
 ... 3 more
Caused by: java.io.IOException: Cannot run program 
"/home/hadoop/work/test.pl": java.io.IOException: error=2, No such file or 
directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
 at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:117)
 ... 9 more
Caused by: java.io.IOException: java.io.IOException: error=2, No such file 
or directory
 at java.lang.UNIXProcess.&lt;init&gt;(UNIXProcess.java:148)
 at java.lang.ProcessImpl.start(ProcessImpl.java:65)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
 ... 10 more


It seems that Java cannot find the file test.pl, but it surely is there. 
How should I configure directories so that Java can access it?


Thanks,
Manhee

Re: Execution Error: ExecDriver

Posted by Manhee Jo <jo...@nttdocomo.com>.
Thank you.
I tried changing the order but came up with a semantic error, which I could 
solve.
Everything works now. Thank you so much.


Regards,
Manhee
  ----- Original Message ----- 
  From: Joydeep Sen Sarma
  To: hive-user@hadoop.apache.org
  Cc: Jeff Hammerbacher
  Sent: Thursday, May 14, 2009 2:14 PM
  Subject: RE: Execution Error: ExecDriver


  --- snip ---

RE: Execution Error: ExecDriver

Posted by Joydeep Sen Sarma <js...@facebook.com>.
Quotes not required. This is not going through the rest of the hive language parser that requires quotes. So that add file is working fine (there was something wrong about the earlier use - perhaps an extra character or something).

Try inverting the order of the 'using' and 'as' clauses.

For the using clause - please use the name of the file only (test.pl) and not the entire file path. The original path name is irrelevant (since the file gets copied into the distributed cache).
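Taken together, the suggestions above imply a query of roughly this shape (a sketch, assuming test.pl has already been registered with ADD FILE; in Hive's TRANSFORM clause, USING precedes AS):

```sql
-- assumes: hive> ADD FILE /home/hadoop/work/test.pl;
FROM (
  FROM pref
  SELECT TRANSFORM(pref.id, pref.pref)
         USING 'test.pl'        -- bare file name, not the full path
         AS (oid, opref)
  CLUSTER BY oid
) tmap
INSERT OVERWRITE TABLE pref_new SELECT tmap.oid, tmap.opref;
```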

________________________________
From: Manhee Jo [mailto:jo@nttdocomo.com]
Sent: Wednesday, May 13, 2009 10:08 PM
To: hive-user@hadoop.apache.org
Cc: Jeff Hammerbacher
Subject: Re: Execution Error: ExecDriver

--- snip ---

Re: Execution Error: ExecDriver

Posted by Manhee Jo <jo...@nttdocomo.com>.
Thank you all. I am progressing little by little.


hive> !ls -l;
...
-rwxr-xr-x 1 hadoop user 134 2009-xxx xx:xx test.pl
...

hive> add file '/home/hadoop/work/test.pl';
'/home/hadoop/work/test.pl' does not exist

hive> add file "/home/hadoop/work/test.pl";
"/home/hadoop/work/test.pl" does not exist

hive> add file /home/hadoop/work/test.pl;
hive> list files;
/home/hadoop/work/test.pl


Do I really need a single or double quotation mark?
In addition, I didn't use the Hive under Hadoop's contrib; I built it with Ant 
using -Dhadoop.version="0.19.1".
Anyway, now I can add test.pl. The next error is

hive> FROM (
    >   FROM pref
    >   SELECT TRANSFORM(pref.id, pref.pref) AS (oid, opref)
    >          USING '/home/hiveuser/test.pl'
    >   CLUSTER BY oid
    > ) tmap
    > INSERT OVERWRITE TABLE pref_new SELECT tmap.oid, tmap.opref;
FAILED: Parse Error: line 4:38 mismatched input 'AS' expecting USING in 
transform clause


Any suggestions please?


Thanks,
Manhee


  ----- Original Message ----- 
  From: Zheng Shao
  To: hive-user@hadoop.apache.org
  Cc: Jeff Hammerbacher
  Sent: Thursday, May 14, 2009 2:00 AM
  Subject: Re: Execution Error: ExecDriver


  Another problem is that it seems that the file name is not quoted.
  We need to use either single or double quotation mark (' or ") to quote 
the whole path.

  Zheng


  On Wed, May 13, 2009 at 8:40 PM, Joydeep Sen Sarma <js...@facebook.com> 
wrote:

     This should work. What version of Hive are you running? (It almost seems 
that the add functionality is not implemented, which it has been forever. 
Hope you aren't using Hive from the contrib section of hadoop-19.)




----------------------------------------------------------------------------

    From: Manhee Jo [mailto:jo@nttdocomo.com]
    Sent: Tuesday, May 12, 2009 11:22 PM
    To: hive-user@hadoop.apache.org
    Cc: Jeff Hammerbacher


    Subject: Re: Execution Error: ExecDriver

    Thank you, Jeff.
    This time I've got a different one.




    hive> add file /home/hadoop/work/test.pl;
    FAILED: Parse Error: line 1:0 cannot recognize input 'add'



    java.lang.NullPointerException
     at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:221)
     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:112)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:137)
     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:234)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)




    Is this due to incompatibility among different Hadoop version?

    I am running hive over hadoop 0.19.1




    Thanks,
    Manhee

      ----- Original Message ----- 

      From: Jeff Hammerbacher

      To: jo@nttdocomo.com

      Sent: Wednesday, May 13, 2009 12:40 PM

      Subject: Re: Execution Error: ExecDriver



      Hey Manhee,

      Try saying "ADD FILE test.pl" from the Hive CLI before running the 
job. Hive executes custom MapReduce jobs using the Hadoop Streaming 
interface, and ADD FILE is Hive's way of sticking your file into the 
DistributedCache so that it's accessible on the nodes running your tasks 
(similar to the "-file" option for Hadoop Streaming). See 
http://hadoop.apache.org/core/docs/r0.20.0/streaming.html#Package+Files+With+Job+Submissions 
for more details on the Streaming side of things; I couldn't find much 
documentation for the Hive side of things other than 
http://wiki.apache.org/hadoop/Hive/LanguageManual/Cli and 
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform.


      so, you'd do:
      hive> add file /home/hadoop/work/test.pl
      hive> from ( .....) using 'test.pl' cluster by ....

      Regards,
      Jeff

      --- snip ---







      2009/5/12 Manhee Jo <jo...@nttdocomo.com>

      Hello. Can anybody help me please?
      When I was running





      hive> from ( from pref select transform(pref.id, pref.pref) as (oid, 
opref)
          > using '/home/hadoop/work/test.pl' cluster by oid ) tmap
          > insert overwrite table pref_new select tmap.oid, tmap.opref;




      I came up with an execution error saying




      FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.ExecDriver




      The extended description of pref_new is




      hive> describe extended pref_new;
      OK
      id int
      pref string
      Detailed Table Information:
      Table(tableName:pref_new,dbName:default,owner:hadoop,createTime:1242036442,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:int,comment:null), 
FieldSchema(name:pref,type:string,comment:null)],location:/user/hive/warehouse/pref_new,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,parameters:{serialization.format=,,line.delim=
      ,field.delim=,}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[],parameters:{SORTBUCKETCOLSPREFIX=TRUE})




      test.pl is




      #!/usr/bin/perl



      while (<>) {
          chop;
          my(@w) = split('\t', $_);
          $w[1] =~ s/abc$//;
          printf("%d\t%s\n", $w[0], $w[1]);
      }




      Contents of /tmp/hadoop/hive.log is




      2009-05-13 10:13:58,824 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref { i32 
id, string pref}
      2009-05-13 10:13:58,953 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}
      2009-05-13 10:13:59,065 WARN  exec.ExecDriver 
(ExecDriver.java:fillInDefaults(82)) - Number of reduce tasks not specified. 
Defaulting to jobconf value of: 1
      2009-05-13 10:14:00,083 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(547)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
      2009-05-13 10:14:21,118 ERROR exec.ExecDriver 
(SessionState.java:printError(242)) - Ended Job = job_200905130946_0002 with 
errors
      2009-05-13 10:14:21,129 ERROR ql.Driver 
(SessionState.java:printError(242)) - FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.ExecDriver
      2009-05-13 10:14:35,294 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}




      And task details of a failed attempt from the web browser 
(http://...:50030/taskdetails.jsp? ...) is




      java.lang.RuntimeException: Map operator initialization failed
       at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:62)
       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
       at org.apache.hadoop.mapred.Child.main(Child.java:158)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot 
initialize ScriptOperator
       at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:130)
       at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
       at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initialize(SelectOperator.java:48)
       at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
       at 
org.apache.hadoop.hive.ql.exec.ForwardOperator.initialize(ForwardOperator.java:35)
       at 
org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:166)
       at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
       ... 3 more
      Caused by: java.io.IOException: Cannot run program 
"/home/hadoop/work/test.pl": java.io.IOException: error=2, No such file or 
directory
       at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
       at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:117)
       ... 9 more
      Caused by: java.io.IOException: java.io.IOException: error=2, No such 
file or directory
 at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
       at java.lang.ProcessImpl.start(ProcessImpl.java:65)
       at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
       ... 10 more




      It seems that Java cannot find the file test.pl, but it surely is 
there. How should I configure directories so that Java can access it?




      Thanks,
      Manhee
  -- 
  Yours,
  Zheng

Re: Execution Error: ExecDriver

Posted by Zheng Shao <zs...@gmail.com>.
Another problem is that the file name does not appear to be quoted.
We need to wrap the whole path in either single or double quotation
marks (' or ").
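Taken together with the earlier ADD FILE suggestion, a minimal sketch of the corrected flow (reusing the table, column, and script names from the original query; whether ADD FILE itself requires the quotes may depend on the Hive version):

```sql
-- Ship the script to the task nodes via the DistributedCache first,
-- then reference it by its (quoted) base name in the USING clause.
add file '/home/hadoop/work/test.pl';

from ( from pref
       select transform(pref.id, pref.pref) as (oid, opref)
       using 'test.pl'
       cluster by oid ) tmap
insert overwrite table pref_new
select tmap.oid, tmap.opref;
```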

Zheng

On Wed, May 13, 2009 at 8:40 PM, Joydeep Sen Sarma <js...@facebook.com>wrote:

>    This should work. What version of Hive are you running? (It almost seems
> that the ADD functionality is not implemented, though it has been for a long
> time. I hope you aren't using Hive from the contrib section of hadoop-19.)
>
>
>  ------------------------------
>
> *From:* Manhee Jo [mailto:jo@nttdocomo.com]
> *Sent:* Tuesday, May 12, 2009 11:22 PM
> *To:* hive-user@hadoop.apache.org
> *Cc:* Jeff Hammerbacher
>
> *Subject:* Re: Execution Error: ExecDriver
>
> Thank you, Jeff.
> This time I've got a different one.
>
>
>
>
> hive> add file /home/hadoop/work/test.pl;
> FAILED: Parse Error: line 1:0 cannot recognize input 'add'
>
>
>
> java.lang.NullPointerException
>  at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:221)
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:112)
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:137)
>  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:234)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>  at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  at java.lang.reflect.Method.invoke(Method.java:597)
>  at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>  at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>  at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>
>
>
>
> Is this due to an incompatibility between different Hadoop versions?
>
> I am running hive over hadoop 0.19.1
>
>
>
>
> Thanks,
> Manhee
>
>  ----- Original Message -----
>
> *From:* Jeff Hammerbacher <ha...@cloudera.com>
>
> *To:* jo@nttdocomo.com
>
> *Sent:* Wednesday, May 13, 2009 12:40 PM
>
> *Subject:* Re: Execution Error: ExecDriver
>
>
>
> Hey Manhee,
>
> Try saying "ADD FILE test.pl" from the Hive CLI before running the job.
> Hive executes custom MapReduce jobs using the Hadoop Streaming interface,
> and ADD FILE is Hive's way of sticking your file into the DistributedCache
> so that it's accessible on the nodes running your tasks (similar to the
> "-file" option for Hadoop Streaming). See
> http://hadoop.apache.org/core/docs/r0.20.0/streaming.html#Package+Files+With+Job+Submissions for more details on the Streaming side of things; I couldn't find much
> documentation for the Hive side of things other than
> http://wiki.apache.org/hadoop/Hive/LanguageManual/Cli and
> http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform.
>
>
> so, you'd do:
> hive> add file /home/hadoop/work/test.pl
> hive> from ( .....) using 'test.pl' cluster by ....
>
> Regards,
> Jeff
>
> --- snip ---
>
>
>
>
>
>
>
> 2009/5/12 Manhee Jo <jo...@nttdocomo.com>
>
> Hello. Can anybody help me please?
> When I was running
>
>
>
>
>
> hive> from ( from pref select transform(pref.id, pref.pref) as (oid,
> opref)
>     > using '/home/hadoop/work/test.pl' cluster by oid ) tmap
>     > insert overwrite table pref_new select tmap.oid, tmap.opref;
>
>
>
>
> I came up with an execution error saying
>
>
>
>
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.ExecDriver
>
>
>
>
> The extended description of pref_new is
>
>
>
>
> hive> describe extended pref_new;
> OK
> id int
> pref string
> Detailed Table Information:
> Table(tableName:pref_new,dbName:default,owner:hadoop,createTime:1242036442,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:int,comment:null),
> FieldSchema(name:pref,type:string,comment:null)],location:/user/hive/warehouse/pref_new,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,parameters:{serialization.format=,,line.delim=
>
> ,field.delim=,}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[],parameters:{SORTBUCKETCOLSPREFIX=TRUE})
>
>
>
>
> test.pl is
>
>
>
>
> #!/usr/bin/perl
>
>
>
> while (<>) {
>     chop;
>     my(@w) = split('\t', $_);
>     $w[1] =~ s/abc$//;
>     printf("%d\t%s\n", $w[0], $w[1]);
> }
>
>
>
>
> Contents of /tmp/hadoop/hive.log is
>
>
>
>
> 2009-05-13 10:13:58,824 WARN  hive.log
> (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref { i32
> id, string pref}
> 2009-05-13 10:13:58,953 WARN  hive.log
> (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new {
> i32 id, string pref}
> 2009-05-13 10:13:59,065 WARN  exec.ExecDriver
> (ExecDriver.java:fillInDefaults(82)) - Number of reduce tasks not specified.
> Defaulting to jobconf value of: 1
> 2009-05-13 10:14:00,083 WARN  mapred.JobClient
> (JobClient.java:configureCommandLineOptions(547)) - Use GenericOptionsParser
> for parsing the arguments. Applications should implement Tool for the same.
> 2009-05-13 10:14:21,118 ERROR exec.ExecDriver
> (SessionState.java:printError(242)) - Ended Job = job_200905130946_0002 with
> errors
> 2009-05-13 10:14:21,129 ERROR ql.Driver (SessionState.java:printError(242))
> - FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.ExecDriver
> 2009-05-13 10:14:35,294 WARN  hive.log
> (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new {
> i32 id, string pref}
>
>
>
>
> And task details of a failed attempt from the web browser (
> http://...:50030/taskdetails.jsp? ...) is
>
>
>
>
> java.lang.RuntimeException: Map operator initialization failed
>  at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:62)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>  at org.apache.hadoop.mapred.Child.main(Child.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot
> initialize ScriptOperator
>  at
> org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:130)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
>  at
> org.apache.hadoop.hive.ql.exec.SelectOperator.initialize(SelectOperator.java:48)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
>  at
> org.apache.hadoop.hive.ql.exec.ForwardOperator.initialize(ForwardOperator.java:35)
>  at
> org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:166)
>  at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
>  ... 3 more
> Caused by: java.io.IOException: Cannot run program
> "/home/hadoop/work/test.pl": java.io.IOException: error=2, No such file or
> directory
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
>  at
> org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:117)
>  ... 9 more
> Caused by: java.io.IOException: java.io.IOException: error=2, No such file
> or directory
>  at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>  ... 10 more
>
>
>
>
> It seems that Java cannot find the file test.pl, but it surely is
> there. How should I configure directories so that Java can access it?
>
>
>
>
> Thanks,
> Manhee
>
>
>
>
>
>


-- 
Yours,
Zheng

RE: Execution Error: ExecDriver

Posted by Joydeep Sen Sarma <js...@facebook.com>.
 This should work. What version of Hive are you running? (It almost seems that the ADD functionality is not implemented, though it has been for a long time. I hope you aren't using Hive from the contrib section of hadoop-19.)

________________________________
From: Manhee Jo [mailto:jo@nttdocomo.com]
Sent: Tuesday, May 12, 2009 11:22 PM
To: hive-user@hadoop.apache.org
Cc: Jeff Hammerbacher
Subject: Re: Execution Error: ExecDriver
Thank you, Jeff.
This time I've got a different one.


hive> add file /home/hadoop/work/test.pl;
FAILED: Parse Error: line 1:0 cannot recognize input 'add'

java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:221)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:112)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:137)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:234)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
 at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)


Is this due to an incompatibility between different Hadoop versions?
I am running hive over hadoop 0.19.1


Thanks,
Manhee
----- Original Message -----
From: Jeff Hammerbacher<ma...@cloudera.com>
To: jo@nttdocomo.com<ma...@nttdocomo.com>
Sent: Wednesday, May 13, 2009 12:40 PM
Subject: Re: Execution Error: ExecDriver

Hey Manhee,

Try saying "ADD FILE test.pl" from the Hive CLI before running the job. Hive executes custom MapReduce jobs using the Hadoop Streaming interface, and ADD FILE is Hive's way of sticking your file into the DistributedCache so that it's accessible on the nodes running your tasks (similar to the "-file" option for Hadoop Streaming). See http://hadoop.apache.org/core/docs/r0.20.0/streaming.html#Package+Files+With+Job+Submissions for more details on the Streaming side of things; I couldn't find much documentation for the Hive side of things other than http://wiki.apache.org/hadoop/Hive/LanguageManual/Cli and http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform.


so, you'd do:
hive> add file /home/hadoop/work/test.pl
hive> from ( .....) using 'test.pl' cluster by ....

Regards,
Jeff
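As a local sanity check before registering the script, its transform logic can be exercised outside Hive; a hedged sketch with a made-up input row, where sed stands in for test.pl's s/abc$// substitution:

```shell
# Hive streams each input record to the transform script as tab-separated
# columns on stdin; the script must write tab-separated columns back.
# Hypothetical sample row "1<TAB>tokyoabc"; sed mimics the Perl s/abc$//.
printf '1\ttokyoabc\n' | sed 's/abc$//'
# -> 1<TAB>tokyo
```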

--- snip ---



2009/5/12 Manhee Jo <jo...@nttdocomo.com>
Hello. Can anybody help me please?
When I was running


hive> from ( from pref select transform(pref.id<http://pref.id>, pref.pref) as (oid, opref)
    > using '/home/hadoop/work/test.pl' cluster by oid ) tmap
    > insert overwrite table pref_new select tmap.oid, tmap.opref;


I came up with an execution error saying


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver


The extended description of pref_new is


hive> describe extended pref_new;
OK
id int
pref string
Detailed Table Information:
Table(tableName:pref_new,dbName:default,owner:hadoop,createTime:1242036442,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:int,comment:null), FieldSchema(name:pref,type:string,comment:null)],location:/user/hive/warehouse/pref_new,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,parameters:{serialization.format=,,line.delim=
,field.delim=,}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[],parameters:{SORTBUCKETCOLSPREFIX=TRUE})


test.pl is


#!/usr/bin/perl

while (<>) {
    chop;
    my(@w) = split('\t', $_);
    $w[1] =~ s/abc$//;
    printf("%d\t%s\n", $w[0], $w[1]);
}


Contents of /tmp/hadoop/hive.log is


2009-05-13 10:13:58,824 WARN  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref { i32 id, string pref}
2009-05-13 10:13:58,953 WARN  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { i32 id, string pref}
2009-05-13 10:13:59,065 WARN  exec.ExecDriver (ExecDriver.java:fillInDefaults(82)) - Number of reduce tasks not specified. Defaulting to jobconf value of: 1
2009-05-13 10:14:00,083 WARN  mapred.JobClient (JobClient.java:configureCommandLineOptions(547)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2009-05-13 10:14:21,118 ERROR exec.ExecDriver (SessionState.java:printError(242)) - Ended Job = job_200905130946_0002 with errors
2009-05-13 10:14:21,129 ERROR ql.Driver (SessionState.java:printError(242)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver
2009-05-13 10:14:35,294 WARN  hive.log (MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { i32 id, string pref}


And task details of a failed attempt from the web browser (http://...:50030/taskdetails.jsp? ...) is


java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:62)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.Child.main(Child.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot initialize ScriptOperator
 at org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:130)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
 at org.apache.hadoop.hive.ql.exec.SelectOperator.initialize(SelectOperator.java:48)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
 at org.apache.hadoop.hive.ql.exec.ForwardOperator.initialize(ForwardOperator.java:35)
 at org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:166)
 at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
 ... 3 more
Caused by: java.io.IOException: Cannot run program "/home/hadoop/work/test.pl": java.io.IOException: error=2, No such file or directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
 at org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:117)
 ... 9 more
Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
 at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
 at java.lang.ProcessImpl.start(ProcessImpl.java:65)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
 ... 10 more


It seems that Java cannot find the file test.pl, but it surely is there. How should I configure directories so that Java can access it?


Thanks,
Manhee



Re: Execution Error: ExecDriver

Posted by Manhee Jo <jo...@nttdocomo.com>.
Thank you, Jeff.
This time I've got a different one.


hive> add file /home/hadoop/work/test.pl;
FAILED: Parse Error: line 1:0 cannot recognize input 'add'

java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:221)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:112)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:137)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:234)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
 at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)


Is this due to an incompatibility between different Hadoop versions?
I am running hive over hadoop 0.19.1


Thanks,
Manhee
  ----- Original Message ----- 
  From: Jeff Hammerbacher
  To: jo@nttdocomo.com
  Sent: Wednesday, May 13, 2009 12:40 PM
  Subject: Re: Execution Error: ExecDriver


  Hey Manhee,

  Try saying "ADD FILE test.pl" from the Hive CLI before running the job. 
Hive executes custom MapReduce jobs using the Hadoop Streaming interface, 
and ADD FILE is Hive's way of sticking your file into the DistributedCache 
so that it's accessible on the nodes running your tasks (similar to the 
"-file" option for Hadoop Streaming). See 
http://hadoop.apache.org/core/docs/r0.20.0/streaming.html#Package+Files+With+Job+Submissions 
for more details on the Streaming side of things; I couldn't find much 
documentation for the Hive side of things other than 
http://wiki.apache.org/hadoop/Hive/LanguageManual/Cli and 
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform.


  so, you'd do:
  hive> add file /home/hadoop/work/test.pl
  hive> from ( .....) using 'test.pl' cluster by ....

  Regards,
  Jeff

  --- snip ---




  2009/5/12 Manhee Jo <jo...@nttdocomo.com>

    Hello. Can anybody help me please?
    When I was running


    hive> from ( from pref select transform(pref.id, pref.pref) as (oid, 
opref)
        > using '/home/hadoop/work/test.pl' cluster by oid ) tmap
        > insert overwrite table pref_new select tmap.oid, tmap.opref;


    I came up with an execution error saying


    FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.ExecDriver


    The extended description of pref_new is


    hive> describe extended pref_new;
    OK
    id int
    pref string
    Detailed Table Information:
    Table(tableName:pref_new,dbName:default,owner:hadoop,createTime:1242036442,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:int,comment:null), 
FieldSchema(name:pref,type:string,comment:null)],location:/user/hive/warehouse/pref_new,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:-1,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,parameters:{serialization.format=,,line.delim=
    ,field.delim=,}),bucketCols:[],sortCols:[],parameters:{}),partitionKeys:[],parameters:{SORTBUCKETCOLSPREFIX=TRUE})


    test.pl is


    #!/usr/bin/perl

    while (<>) {
        chop;
        my(@w) = split('\t', $_);
        $w[1] =~ s/abc$//;
        printf("%d\t%s\n", $w[0], $w[1]);
    }


    Contents of /tmp/hadoop/hive.log is


    2009-05-13 10:13:58,824 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref { i32 
id, string pref}
    2009-05-13 10:13:58,953 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}
    2009-05-13 10:13:59,065 WARN  exec.ExecDriver 
(ExecDriver.java:fillInDefaults(82)) - Number of reduce tasks not specified. 
Defaulting to jobconf value of: 1
    2009-05-13 10:14:00,083 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(547)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
    2009-05-13 10:14:21,118 ERROR exec.ExecDriver 
(SessionState.java:printError(242)) - Ended Job = job_200905130946_0002 with 
errors
    2009-05-13 10:14:21,129 ERROR ql.Driver 
(SessionState.java:printError(242)) - FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.ExecDriver
    2009-05-13 10:14:35,294 WARN  hive.log 
(MetaStoreUtils.java:getDDLFromFieldSchema(449)) - DDL: struct pref_new { 
i32 id, string pref}


    And task details of a failed attempt from the web browser 
(http://...:50030/taskdetails.jsp? ...) is


    java.lang.RuntimeException: Map operator initialization failed
     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:62)
     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
     at org.apache.hadoop.mapred.Child.main(Child.java:158)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot 
initialize ScriptOperator
     at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:130)
     at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
     at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initialize(SelectOperator.java:48)
     at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:180)
     at 
org.apache.hadoop.hive.ql.exec.ForwardOperator.initialize(ForwardOperator.java:35)
     at 
org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:166)
     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
     ... 3 more
    Caused by: java.io.IOException: Cannot run program 
"/home/hadoop/work/test.pl": java.io.IOException: error=2, No such file or 
directory
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
     at 
org.apache.hadoop.hive.ql.exec.ScriptOperator.initialize(ScriptOperator.java:117)
     ... 9 more
    Caused by: java.io.IOException: java.io.IOException: error=2, No such 
file or directory
     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
     at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
     ... 10 more


    It seems that Java cannot find the file test.pl, but it surely is 
there. How should I configure directories so that Java can access it?


    Thanks,
    Manhee