Posted to dev@hive.apache.org by Marcus Herou <ma...@tailsweep.com> on 2009/03/04 11:32:27 UTC

Cannot issue query

Hi.

I started experimenting with Hive today, since it seems to suit us quite well:
we already process our weblog stats with Hadoop and end up doing SQL-style
work in MapReduce form, so it seems fair to try out a system that does it
in one step :)

I've created and loaded data into Hive with the following statements:
hive> drop table DailyUniqueSiteVisitorSample;
OK
Time taken: 4.064 seconds
hive> CREATE TABLE DailyUniqueSiteVisitorSample (sampleDate date,uid
bigint,site int,concreteStatistics int,network smallint,category
smallint,country smallint,countryCode String,sessions
smallint,pageImpressions smallint) COMMENT 'This is our weblog stats table'
PARTITIONED BY(dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.248 seconds
hive> LOAD DATA LOCAL INPATH
'/tmp/data-DenormalizedSiteVisitor.VisitsPi.2009-03-02.csv' INTO TABLE
DailyUniqueSiteVisitorSample PARTITION(dt='2009-03-02');
Copying data from file:/tmp/data-2009-03-02.csv
Loading data to table dailyuniquesitevisitorsample partition {dt=2009-03-02}
OK
Time taken: 2.258 seconds

I was a little confused about the STORED AS TEXTFILE part, but since the CSV I
need to load is a text file it seemed the right choice (the tutorial only uses
SequenceFiles); it seems to work, though.
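To convince myself that the CSV really matches the declared columns before the LOAD, a quick check along these lines could work (a sketch under assumptions: the sample row and the Python type mapping below are made up, not taken from the real export):

```python
# Column list mirroring the CREATE TABLE statement above.
# The CSV stores everything as text; the Python types here are just
# a hypothetical sanity check, not Hive's own parsing.
SCHEMA = [
    ("sampleDate", str),          # date, kept as text in the CSV
    ("uid", int),                 # bigint
    ("site", int),
    ("concreteStatistics", int),
    ("network", int),             # smallint
    ("category", int),
    ("country", int),
    ("countryCode", str),
    ("sessions", int),
    ("pageImpressions", int),
]

def check_row(line):
    """Parse one CSV line; raise if the field count or a type mismatches."""
    fields = line.rstrip("\n").split(",")
    if len(fields) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
    return {name: typ(value) for (name, typ), value in zip(SCHEMA, fields)}

# Hypothetical sample row -- adjust to the actual export format.
row = check_row("2009-03-02,123456789,1,42,3,7,46,SE,2,5")
```

Note that the partition column (dt) is not in the file itself; it comes from the PARTITION clause of the LOAD statement.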

Anyway, this goes well, but when I issue a simple query like the one below,
it throws an exception:
hive> select dailyuniquesitevisitorsample.* from
dailyuniquesitevisitorsample where dailyuniquesitevisitorsample.site=1;
Total MapReduce jobs = 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.AbstractMethodError: org.apache.hadoop.hive.ql.io.HiveInputFormat.validateInput(Lorg/apache/hadoop/mapred/JobConf;)V
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:735)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:239)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:306)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)


I run Hadoop 0.18.2.

Not sure that I am doing this correctly. Please guide me if I am stupid.

Kindly

//Marcus

-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/

Re: Cannot issue query

Posted by Marcus Herou <ma...@tailsweep.com>.
That did it!

However, I now get this from the jobtracker (viewing the web GUI for a map
task):

java.lang.NullPointerException
	at org.apache.hadoop.hive.serde2.lazy.LazyStruct.parse(LazyStruct.java:103)
	at org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:151)
	at org.apache.hadoop.hive.serde2.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:123)
	at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:104)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:53)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:63)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:306)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:49)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:306)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:178)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:71)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

This is the query:

hive> select d.* from DailyUniqueSiteVisitorSample d where d.site=1
and d.dt='2009-03-03';
Total MapReduce jobs = 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200902111713_6047, Tracking URL = http://mapredcoord:50030/jobdetails.jsp?jobid=job_200902111713_6047
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=hdfs://mapredcoord:9001/ -kill job_200902111713_6047
 map = 0%,  reduce = 0%
 map = 100%,  reduce = 100%
Ended Job = job_200902111713_6047 with errors
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.ExecDriver
Time taken: 24.805 seconds
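In case it helps with debugging: a NullPointerException in LazyStruct.parse while a column is being evaluated makes me suspect rows that don't split into the declared number of fields (just a guess, not confirmed anywhere in this thread). A quick scan for such rows could be sketched like this (the sample lines are made up):

```python
def find_bad_rows(lines, expected_fields=10, delimiter=","):
    """Yield (line_number, field_count) for every row whose field count
    differs from the declared schema width."""
    for n, line in enumerate(lines, start=1):
        count = len(line.rstrip("\n").split(delimiter))
        if count != expected_fields:
            yield (n, count)

# Hypothetical input: one well-formed row, one truncated row.
sample = [
    "2009-03-02,1,1,1,1,1,1,SE,1,1\n",
    "2009-03-02,1,1\n",
]
bad = list(find_bad_rows(sample))
```

Running the equivalent check over the real CSV (before loading it) would at least rule the input data in or out as the culprit.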


On Wed, Mar 4, 2009 at 8:18 PM, Scott Carey <sc...@richrelevance.com> wrote:

>  Adding -Dhadoop.version="0.18.2" to the ant build is not sufficient on
> its own.
>
> ALSO:
> Run ant clean
> Clean out the target dir
>
> then
> Build with the right hadoop version flag passed in.
>
> If you switch the hadoop version to build, you must run an ant clean for
> the change to take effect.
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/

Re: Cannot issue query

Posted by Scott Carey <sc...@richrelevance.com>.
Adding -Dhadoop.version="0.18.2" to the ant build is not sufficient on its own.

Also:
- Run ant clean
- Clean out the target dir
- Then build with the right hadoop version flag passed in.

If you switch the hadoop version to build, you must run an ant clean for the change to take effect.
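Putting those steps together, the rebuild might look roughly like this (the checkout path and the names of the build output dirs are assumptions; adjust to your tree):

```shell
# From the Hive source checkout (path assumed)
cd ~/src/hive

# 1. Clean artifacts built against the wrong Hadoop version
ant clean

# 2. Also wipe the build/dist output dirs (locations are an assumption)
rm -rf build dist

# 3. Rebuild, passing the Hadoop version flag
ant -Dhadoop.version="0.18.2" package
```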



Re: Cannot issue query

Posted by Marcus Herou <ma...@tailsweep.com>.
Hi Johan and thanks for the fast reply.

Yes, it did seem like Hive could not find the method.

That did not do it, same exception... I wiped the dist dir and
dropped/created/loaded the data, then issued the query again.
Should I also wipe some other local/HDFS dir?

Upgrading to 0.19.1 is not an option for some weeks.

On Wed, Mar 4, 2009 at 11:59 AM, Johan Oskarsson <jo...@oskarsson.nu> wrote:

> Hi Marcus,
>
> It looks like you've hit on a Hadoop 0.18 vs 0.19 issue,
> try to compile Hive using: ant -Dhadoop.version="0.18.2" package
>
> That runs some preprocessing steps to remove 0.19 specific code from Hive.
>
> /Johan


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/

Re: Cannot issue query

Posted by Johan Oskarsson <jo...@oskarsson.nu>.
Hi Marcus,

It looks like you've hit a Hadoop 0.18 vs. 0.19 compatibility issue;
try compiling Hive with: ant -Dhadoop.version="0.18.2" package

That runs some preprocessing steps to remove 0.19 specific code from Hive.

/Johan


