Posted to user@hbase.apache.org by ch huang <ju...@gmail.com> on 2014/07/22 10:01:57 UTC

issue about testing importtsv with other field separator

Hi, mailing list:

I tested the HBase 0.96.1.1 ImportTsv tool and found that it does not work
with a non-tab field separator.

# sudo -u hdfs hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
-Dimporttsv.columns=HBASE_ROW_KEY,myco1,mycol2
"-Dmporttsv.separator=|" alex:mymy2 /tmp/alex_test

2014-07-22 15:55:59,746 INFO  [main] mapreduce.Job: Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=113939
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=106
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=2
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Launched map tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=2160
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=2160
                Total vcore-seconds taken by all map tasks=2160
                Total megabyte-seconds taken by all map tasks=4423680
        Map-Reduce Framework
                Map input records=1
                Map output records=0
                Input split bytes=96
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=18
                CPU time spent (ms)=190
                Physical memory (bytes) snapshot=278962176
                Virtual memory (bytes) snapshot=2180153344
                Total committed heap usage (bytes)=502267904
        ImportTsv
                Bad Lines=1                        #  why bad lines?
        File Input Format Counters
                Bytes Read=10
        File Output Format Counters
                Bytes Written=0

hbase(main):015:0> scan 'alex:mymy2'
ROW                                           COLUMN+CELL
0 row(s) in 0.0030 seconds

# hadoop fs -cat /tmp/alex_test
aa|bb|dd

Re: issue about testing importtsv with other field separator

Posted by Esteban Gutierrez <es...@cloudera.com>.

Hello,

It seems that you have a typo in the command line: -Dmporttsv.separator
should be -Dimporttsv.separator

cheers,
esteban.



--
Cloudera, Inc.
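
To illustrate why the typo plausibly leads to "Bad Lines=1": a misspelled -D property is simply ignored, so the tool would fall back to its default tab separator and the sample line would not split into the three expected fields. The sketch below uses awk as a stand-in for the tool's parser (an assumption made for illustration, not ImportTsv's actual code) to count the fields both ways. It then prints the command with the property name corrected, without running it. The variable names are hypothetical; the table, columns, and path are copied unchanged from the original post.

```shell
#!/bin/sh
# Sample line from /tmp/alex_test, as shown in the original post.
line='aa|bb|dd'

# Count fields when splitting on a tab (ImportTsv's default separator)
# versus on "|". awk only stands in for the tool's parser here.
tab_fields=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')
pipe_fields=$(printf '%s\n' "$line" | awk -F'|' '{ print NF }')
echo "fields with tab separator:  $tab_fields"
echo "fields with pipe separator: $pipe_fields"

# Corrected invocation with the property spelled importtsv.separator.
# It is only assembled and printed here, not executed.
fixed_cmd='sudo -u hdfs hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,myco1,mycol2 "-Dimporttsv.separator=|" alex:mymy2 /tmp/alex_test'
echo "$fixed_cmd"
```

With a tab separator the line counts as one field; with "|" it counts as three, matching the three entries in importtsv.columns. Whether the import then succeeds also depends on the table and column families existing, which this thread does not show.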


