Posted to user@hbase.apache.org by James Ram <hb...@gmail.com> on 2011/06/15 08:29:51 UTC

Difficulty using importtsv tool

Hi,

I'm having trouble using the importtsv tool.
I ran the following command:

hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
-Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
beneficiarydetails user/sao_user/UserProfile.tsv


And I got the following error:
ERROR: One or more columns in addition to the row key are required
Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>

I can't really spot the problem in the way I wrote the command. I have
added the column names after HBASE_ROW_KEY.
Where did I go wrong?

-- 
With Regards,
Jr.

Re: Difficulty using importtsv tool

Posted by Jean-Daniel Cryans <jd...@apache.org>.
It's reading from the raw (local) filesystem, which means you are
probably running with the local job tracker. Like Todd said here:
http://search-hadoop.com/m/4qiKjU9S6e

You need to make sure that HBase knows about the mapred-site.xml file
by altering its classpath. More here:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath
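
One way to do it in 0.90, roughly along the lines of what that page
describes (untested here, and paths depend on your layout): launch the
job through the hadoop script with the hbase classpath exported, so that
${HADOOP_HOME}/conf, mapred-site.xml included, is picked up:

  HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
    ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar importtsv ...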

J-D

On Wed, Jun 15, 2011 at 11:48 PM, James Ram <hb...@gmail.com> wrote:
> Hi,
>
> Well, I removed all the spaces between the column names and it worked.
>
> Another problem I'm facing now: when I use the importtsv.bulk.output
> option, the job fails with the following error:
>
>  INFO mapred.JobClient: Task Id : attempt_201106161041_0003_m_000000_2,
> Status : FAILED
> java.lang.IllegalArgumentException: Can't read partitions file
>         at
> org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
>         at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>         at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>         at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.FileNotFoundException: File _partition.lst does not
> exist.
>         at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
>         at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
>         at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>         at
> org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
>
> I am running a 5-machine cluster and all the mapred-site.xml files include
> the jobtracker address. I have also attached the mapred-site.xml with this
> mail.
> I tried to validate the file with xmllint and got the error
> "mapred-site.xml:4: validity error : Validation failed: no DTD found !"
> The mapred-default.xml doesn't look much different from the one I have
> attached, and it doesn't seem to include a DTD either.
>
> Thanks,
> JR
>
>
>
>
> On Wed, Jun 15, 2011 at 12:13 PM, Prashant Sharma
> <me...@gmail.com> wrote:
>>
>> Did you write the table name? And did you remove the extra space after
>> HBASE_ROW_KEY? I think that must be the reason.
>> (I am not an expert, but I have struggled a lot with this.)
>> Thanks,
>> Prashant
>> On Wed, Jun 15, 2011 at 11:59 AM, James Ram <hb...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I'm having trouble using the importtsv tool.
>> > I ran the following command:
>> >
>> > hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
>> > -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
>> > b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
>> > b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
>> > beneficiarydetails user/sao_user/UserProfile.tsv
>> >
>> >
>> > And I got the following error:
>> > ERROR: One or more columns in addition to the row key are required
>> > Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
>> >
>> > I can't really spot the problem in the way I wrote the command. I have
>> > added the column names after HBASE_ROW_KEY.
>> > Where did I go wrong?
>> >
>> > --
>> > With Regards,
>> > Jr.
>> >
>
>
>
> --
> With Regards,
> Jr.
>

Re: Difficulty using importtsv tool

Posted by Stack <st...@duboce.net>.
Can you find the below in your filesystem?

Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.

It's not able to locate it.  Does it exist?  Do you see logs about it
being written earlier?
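
Something like the below might turn it up (paths are guesses; the
partitions file normally gets staged under the job's working directory
on HDFS and then localized as _partition.lst via the distributed cache):

  hadoop fs -lsr /user/sao_user | grep -i partition    # staged copy on HDFS?
  find /tmp -name '_partition.lst' 2>/dev/null         # stray local copy?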

St.Ack

On Wed, Jun 15, 2011 at 11:48 PM, James Ram <hb...@gmail.com> wrote:
> Hi,
>
> Well, I removed all the spaces between the column names and it worked.
>
> Another problem I'm facing now: when I use the importtsv.bulk.output
> option, the job fails with the following error:
>
>  INFO mapred.JobClient: Task Id : attempt_201106161041_0003_m_000000_2,
> Status : FAILED
> java.lang.IllegalArgumentException: Can't read partitions file
>         at
> org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
>         at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>         at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>         at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.FileNotFoundException: File _partition.lst does not
> exist.
>         at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
>         at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
>         at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>         at
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>         at
> org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
>
> I am running a 5-machine cluster and all the mapred-site.xml files include
> the jobtracker address. I have also attached the mapred-site.xml with this
> mail.
> I tried to validate the file with xmllint and got the error
> "mapred-site.xml:4: validity error : Validation failed: no DTD found !"
> The mapred-default.xml doesn't look much different from the one I have
> attached, and it doesn't seem to include a DTD either.
>
> Thanks,
> JR
>
>
>
>
> On Wed, Jun 15, 2011 at 12:13 PM, Prashant Sharma
> <me...@gmail.com> wrote:
>>
>> Did you write the table name? And did you remove the extra space after
>> HBASE_ROW_KEY? I think that must be the reason.
>> (I am not an expert, but I have struggled a lot with this.)
>> Thanks,
>> Prashant
>> On Wed, Jun 15, 2011 at 11:59 AM, James Ram <hb...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I'm having trouble using the importtsv tool.
>> > I ran the following command:
>> >
>> > hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
>> > -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
>> > b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
>> > b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
>> > beneficiarydetails user/sao_user/UserProfile.tsv
>> >
>> >
>> > And I got the following error:
>> > ERROR: One or more columns in addition to the row key are required
>> > Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
>> >
>> > I can't really spot the problem in the way I wrote the command. I have
>> > added the column names after HBASE_ROW_KEY.
>> > Where did I go wrong?
>> >
>> > --
>> > With Regards,
>> > Jr.
>> >
>
>
>
> --
> With Regards,
> Jr.
>

Re: Difficulty using importtsv tool

Posted by James Ram <hb...@gmail.com>.
Hi,

Well, I removed all the spaces between the column names and it worked.

Another problem I'm facing now: when I use the importtsv.bulk.output
option, the job fails with the following error:

 INFO mapred.JobClient: Task Id : attempt_201106161041_0003_m_000000_2,
Status : FAILED
java.lang.IllegalArgumentException: Can't read partitions file
        at
org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
        at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
        at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at
org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.FileNotFoundException: File _partition.lst does not
exist.
        at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
        at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
        at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
        at
org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)

I am running a 5-machine cluster and all the mapred-site.xml files include
the jobtracker address. I have also attached the mapred-site.xml with this
mail.
I tried to validate the file with xmllint and got the error
"mapred-site.xml:4: validity error : Validation failed: no DTD found !"
The mapred-default.xml doesn't look much different from the one I have
attached, and it doesn't seem to include a DTD either.
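
For reference, the command I'm running is roughly of this form (real
output path elided, column list the same as before):

  hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar importtsv \
    -Dimporttsv.bulk.output=<hdfs-output-dir> \
    -Dimporttsv.columns=HBASE_ROW_KEY,b_info:name,... \
    beneficiarydetails user/sao_user/UserProfile.tsv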

Thanks,
JR




On Wed, Jun 15, 2011 at 12:13 PM, Prashant Sharma <meetprashant007@gmail.com
> wrote:

> Did you write the table name? And did you remove the extra space after
> HBASE_ROW_KEY? I think that must be the reason.
> (I am not an expert, but I have struggled a lot with this.)
> Thanks,
> Prashant
> On Wed, Jun 15, 2011 at 11:59 AM, James Ram <hb...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm having trouble using the importtsv tool.
> > I ran the following command:
> >
> > hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
> > -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
> > b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
> > b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
> > beneficiarydetails user/sao_user/UserProfile.tsv
> >
> >
> > And I got the following error:
> > ERROR: One or more columns in addition to the row key are required
> > Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
> >
> > I can't really spot the problem in the way I wrote the command. I have
> > added the column names after HBASE_ROW_KEY.
> > Where did I go wrong?
> >
> > --
> > With Regards,
> > Jr.
> >
>



-- 
With Regards,
Jr.

Re: Difficulty using importtsv tool

Posted by Prashant Sharma <me...@gmail.com>.
Did you write the table name? And did you remove the extra space after
HBASE_ROW_KEY? I think that must be the reason.
(I am not an expert, but I have struggled a lot with this.)
Thanks,
Prashant
On Wed, Jun 15, 2011 at 11:59 AM, James Ram <hb...@gmail.com> wrote:

> Hi,
>
> I'm having trouble using the importtsv tool.
> I ran the following command:
>
> hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
> -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
> b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
> b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
> beneficiarydetails user/sao_user/UserProfile.tsv
>
>
> And I got the following error:
> ERROR: One or more columns in addition to the row key are required
> Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
>
> I can't really spot the problem in the way I wrote the command. I have
> added the column names after HBASE_ROW_KEY.
> Where did I go wrong?
>
> --
> With Regards,
> Jr.
>

Re: Difficulty using importtsv tool

Posted by James Ram <hb...@gmail.com>.
Hi,

Removing the spaces worked!

Bill, thanks a lot for taking the time to look into this issue!

Regards,
JR

On Wed, Jun 15, 2011 at 12:21 PM, Bill Graham <bi...@gmail.com> wrote:

> Try removing the spaces in the column list, i.e. commas only.
>
> On Tue, Jun 14, 2011 at 11:29 PM, James Ram <hb...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm having trouble using the importtsv tool.
> > I ran the following command:
> >
> > hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
> > -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
> > b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
> > b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
> > beneficiarydetails user/sao_user/UserProfile.tsv
> >
> >
> > And I got the following error:
> > ERROR: One or more columns in addition to the row key are required
> > Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
> >
> > I can't really spot the problem in the way I wrote the command. I have
> > added the column names after HBASE_ROW_KEY.
> > Where did I go wrong?
> >
> > --
> > With Regards,
> > Jr.
> >
>



-- 
With Regards,
Jr.

Re: Difficulty using importtsv tool

Posted by Bill Graham <bi...@gmail.com>.
Try removing the spaces in the column list, i.e. commas only.
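
Untested, but with your arguments the invocation would look like this
(the column list is a single comma-separated value with no whitespace):

  hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar importtsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,b_info:name,b_info:contactNo,b_info:dob,b_info:email,b_info:marital_status,b_info:p_address,b_info:photo,demo_info:caste,demo_info:gender,demo_info:religion \
    beneficiarydetails user/sao_user/UserProfile.tsv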

On Tue, Jun 14, 2011 at 11:29 PM, James Ram <hb...@gmail.com> wrote:

> Hi,
>
> I'm having trouble using the importtsv tool.
> I ran the following command:
>
> hadoop jar hadoop_sws/hbase-0.90.0/hbase-0.90.0.jar  importtsv
> -Dimporttsv.columns=HBASE_ROW_KEY ,b_info:name, b_info:contactNo,
> b_info:dob, b_info:email, b_info:marital_status, b_info:p_address,
> b_info:photo, demo_info:caste, demo_info:gender, demo_info:religion,
> beneficiarydetails user/sao_user/UserProfile.tsv
>
>
> And I got the following error:
> ERROR: One or more columns in addition to the row key are required
> Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
>
> I can't really spot the problem in the way I wrote the command. I have
> added the column names after HBASE_ROW_KEY.
> Where did I go wrong?
>
> --
> With Regards,
> Jr.
>