Posted to user@hbase.apache.org by ashish singhi <as...@huawei.com> on 2014/04/07 10:25:58 UTC
One question regarding bulk load
Hi all.
I have one question regarding bulk load.
How can I load data where a column value is empty in a few rows, using the bulk load tool?
I tried the following simple example on HBase 0.94.11 and Hadoop 2, with a table having three columns, where the second column's value is empty in a few rows.
Data in the file is in the below format:
row0,value1,value0
row1,,value1
row2,value3,value2
row3,,value3
row4,value5,value4
row5,,value5
row6,value7,value6
row7,,value7
row8,value9,value8
When I execute the command:
hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv -Dimporttsv.skip.bad.lines=false -Dimporttsv.separator=, -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns comma_separated_3columns /comma_separated_3columns.txt
I get the below Exception.
2014-04-07 11:15:01,870 INFO [main] mapreduce.Job (Job.java:printTaskEvents(1424)) - Task Id : attempt_1396526639698_0028_m_000000_2, Status : FAILED
Error: java.io.IOException: org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$BadTsvLineException: No delimiter
at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:135)
at org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper.map(TsvImporterTextMapper.java:33)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
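Because -Dimporttsv.skip.bad.lines=false is set, a single line that cannot be split into the expected fields fails the whole job, so it can be worth sanity-checking the input before submitting. A minimal sketch, assuming the sample data above sits in a local file (the file name here is illustrative; the real input lives on HDFS):

```shell
# Recreate the sample input from this mail (illustrative local path).
printf '%s\n' \
  row0,value1,value0 row1,,value1 row2,value3,value2 \
  row3,,value3 row4,value5,value4 row5,,value5 \
  row6,value7,value6 row7,,value7 row8,value9,value8 \
  > comma_separated_3columns.txt

# Report any line whose field count is not 3; the mapping
# HBASE_ROW_KEY,cf1:c1,cf1:c2 expects exactly three comma-separated
# fields per line. A completely empty line has 0 fields.
awk -F',' 'NF != 3 { print NR ": [" $0 "]" }' comma_separated_3columns.txt
```

Note that an empty second column (row1,,value1) still has three fields, so it parses fine; only lines that are missing the separator entirely, such as blank lines, should trip the BadTsvLineException.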
Regards,
Ashish Singhi
RE: One question regarding bulk load
Posted by ashish singhi <as...@huawei.com>.
Yes. Thanks, Kashif, for pointing it out. There was an empty line at the end of the file.
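In case it helps anyone searching the archives later, a minimal sketch of stripping such empty lines before the import; the file name and contents here are illustrative stand-ins for the real input:

```shell
# A tiny stand-in input with a trailing empty line, like the one that
# broke the job.
printf 'row0,value1,value0\nrow1,,value1\n\n' > comma_separated_3columns.txt

# Drop all empty lines; the cleaned copy is what would be uploaded to
# HDFS and fed to importtsv.
grep -v '^$' comma_separated_3columns.txt > comma_separated_3columns.clean.txt
wc -l < comma_separated_3columns.clean.txt   # 2 data lines remain
```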
Regards
Ashish
-----Original Message-----
From: Kashif Jawed Siddiqui [mailto:kashifjs@huawei.com]
Sent: 07 April 2014 15:28
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: RE: One question regarding bulk load
Hi,
Please check if your file contains empty lines (maybe at the beginning or the end).
Since -Dimporttsv.skip.bad.lines=false is set, any empty lines will cause this error.
Regards
KASHIF
-----Original Message-----
From: ashish singhi [mailto:ashish.singhi@huawei.com]
Sent: 07 April 2014 13:56
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: One question regarding bulk load
RE: One question regarding bulk load
Posted by Kashif Jawed Siddiqui <ka...@huawei.com>.
Hi,
Please check if your file contains empty lines (maybe at the beginning or the end).
Since -Dimporttsv.skip.bad.lines=false is set, any empty lines will cause this error.
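As a side note: if silently dropping such lines is acceptable, the job can be run without the strict flag instead. A hedged sketch of the alternative invocation, reusing the paths from this thread (check the ImportTsv usage text of your HBase version for the exact default of importtsv.skip.bad.lines):

```shell
hadoop jar <HBASE_HOME>/hbase-0.94.11-security.jar importtsv \
  -Dimporttsv.skip.bad.lines=true \
  -Dimporttsv.separator=, \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf1:c1,cf1:c2 \
  -Dimporttsv.bulk.output=/bulkdata/comma_separated_3columns \
  comma_separated_3columns /comma_separated_3columns.txt
```

Also keep in mind that with -Dimporttsv.bulk.output set, importtsv only writes HFiles; they still need to be loaded into the table with the completebulkload step afterwards.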
Regards
KASHIF
-----Original Message-----
From: ashish singhi [mailto:ashish.singhi@huawei.com]
Sent: 07 April 2014 13:56
To: user@hbase.apache.org
Cc: dev@hbase.apache.org
Subject: One question regarding bulk load