Posted to user@hbase.apache.org by ashish singhi <as...@huawei.com> on 2014/04/07 10:25:58 UTC

One question regarding bulk load

Hi all.

I have one question regarding bulk load.
How can I load data where the column values are empty in a few rows, using the bulk load tool?

I tried the following simple example in HBase 0.94.11 on Hadoop 2, with a table having three columns where the second column value is empty in a few rows, using the bulk load tool.


* The data in the file is in the format below:

row0,value1,value0

row1,,value1

row2,value3,value2

row3,,value3

row4,value5,value4

row5,,value5

row6,value7,value6

row7,,value7

row8,value9,value8



* When I execute the command

hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv -Dimporttsv.skip.bad.lines=false -Dimporttsv.separator=, -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns comma_separated_3columns /comma_separated_3columns.txt



I get the below Exception.



2014-04-07 11:15:01,870 INFO  [main] mapreduce.Job (Job.java:printTaskEvents(1424)) - Task Id : attempt_1396526639698_0028_m_000000_2, Status : FAILED

Error: java.io.IOException: org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$BadTsvLineException: No delimiter

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:135)

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:33)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)

        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
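
For reference, here is the same command split across lines for readability, followed by the completebulkload step that is the usual follow-up once a -Dimporttsv.bulk.output run has generated the HFiles (paths and table name are the ones from the command above):

hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv \
  -Dimporttsv.skip.bad.lines=false \
  -Dimporttsv.separator=, \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 \
  -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns \
  comma_separated_3columns /comma_separated_3columns.txt

# Once the HFiles are generated, load them into the table:
hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar completebulkload \
  /bulkdata/comma_separated_3columns comma_separated_3columns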

Regards,
Ashish Singhi

RE: One question regarding bulk load

Posted by ashish singhi <as...@huawei.com>.
Yes. Thanks Kashif for pointing it out. There was an empty line at the end of the file.
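
For anyone hitting the same error, a quick illustrative check on the local copy of the input file before re-running the job (the file name is the one from the command in the quoted mail below; adjust to your own path):

# print the line numbers of any blank lines in the input
awk 'NF == 0 {print NR}' comma_separated_3columns.txt

# write a copy with the blank lines stripped out
sed '/^$/d' comma_separated_3columns.txt > comma_separated_3columns.clean.txt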

Regards
Ashish
-----Original Message-----
From: Kashif Jawed Siddiqui [mailto:kashifjs@huawei.com] 
Sent: 07 April 2014 15:28
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: RE: One question regarding bulk load

Hi,

	Please check if your file contains empty lines (maybe at the beginning or the end).

	Since -Dimporttsv.skip.bad.lines=false is set, any empty lines will cause this error.

Regards
KASHIF

-----Original Message-----
From: ashish singhi [mailto:ashish.singhi@huawei.com] 
Sent: 07 April 2014 13:56
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: One question regarding bulk load

Hi all.

I have one question regarding bulk load.
How can I load data where the column values are empty in a few rows, using the bulk load tool?

I tried the following simple example in HBase 0.94.11 on Hadoop 2, with a table having three columns where the second column value is empty in a few rows, using the bulk load tool.


* The data in the file is in the format below:

row0,value1,value0

row1,,value1

row2,value3,value2

row3,,value3

row4,value5,value4

row5,,value5

row6,value7,value6

row7,,value7

row8,value9,value8



* When I execute the command

hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv -Dimporttsv.skip.bad.lines=false -Dimporttsv.separator=, -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns comma_separated_3columns /comma_separated_3columns.txt



I get the below Exception.



2014-04-07 11:15:01,870 INFO  [main] mapreduce.Job (Job.java:printTaskEvents(1424)) - Task Id : attempt_1396526639698_0028_m_000000_2, Status : FAILED

Error: java.io.IOException: org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$BadTsvLineException: No delimiter

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:135)

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:33)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)

        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)

Regards,
Ashish Singhi

RE: One question regarding bulk load

Posted by Kashif Jawed Siddiqui <ka...@huawei.com>.
Hi,

	Please check if your file contains empty lines (maybe at the beginning or the end).

	Since -Dimporttsv.skip.bad.lines=false is set, any empty lines will cause this error.
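
If the occasional bad line is acceptable, the job can also be re-run with the option flipped, so that such lines are skipped rather than failing the task:

-Dimporttsv.skip.bad.lines=true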

Regards
KASHIF

-----Original Message-----
From: ashish singhi [mailto:ashish.singhi@huawei.com] 
Sent: 07 April 2014 13:56
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: One question regarding bulk load

Hi all.

I have one question regarding bulk load.
How can I load data where the column values are empty in a few rows, using the bulk load tool?

I tried the following simple example in HBase 0.94.11 on Hadoop 2, with a table having three columns where the second column value is empty in a few rows, using the bulk load tool.


* The data in the file is in the format below:

row0,value1,value0

row1,,value1

row2,value3,value2

row3,,value3

row4,value5,value4

row5,,value5

row6,value7,value6

row7,,value7

row8,value9,value8



* When I execute the command

hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv -Dimporttsv.skip.bad.lines=false -Dimporttsv.separator=, -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns comma_separated_3columns /comma_separated_3columns.txt



I get the below Exception.



2014-04-07 11:15:01,870 INFO  [main] mapreduce.Job (Job.java:printTaskEvents(1424)) - Task Id : attempt_1396526639698_0028_m_000000_2, Status : FAILED

Error: java.io.IOException: org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$BadTsvLineException: No delimiter

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:135)

        at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:33)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)

        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)

Regards,
Ashish Singhi
