Posted to commits@hbase.apache.org by st...@apache.org on 2011/05/19 20:37:11 UTC
svn commit: r1125046 - in /hbase/trunk/src:
main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
site/xdoc/bulk-loads.xml
Author: stack
Date: Thu May 19 18:37:10 2011
New Revision: 1125046
URL: http://svn.apache.org/viewvc?rev=1125046&view=rev
Log:
HBASE-3901 Update documentation for ImportTsv to reflect recent features
Modified:
hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
hbase/trunk/src/site/xdoc/bulk-loads.xml
Modified: hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
URL: http://svn.apache.org/viewvc/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java?rev=1125046&r1=1125045&r2=1125046&view=diff
==============================================================================
--- hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java (original)
+++ hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java Thu May 19 18:37:10 2011
@@ -256,7 +256,8 @@ public class ImportTsv {
 "as the row key for each imported record. You must specify exactly one column\n" +
 "to be the row key.\n" +
 "\n" +
- "In order to prepare data for a bulk data load, pass the option:\n" +
+ "By default importtsv will load data directly into HBase. To instead generate\n" +
+ "HFiles of data to prepare for a bulk data load, pass the option:\n" +
 " -D" + BULK_OUTPUT_CONF_KEY + "=/path/for/output\n" +
 "\n" +
 "Other options that may be specified with -D include:\n" +
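The two modes described by the updated usage text can be sketched as shell commands. This is an illustrative sketch only: the table name, column mapping, and HDFS paths are placeholders, and the exact jar name varies by HBase release; none of these values come from the commit itself.

```shell
# Default mode: importtsv loads the TSV data directly into the HBase table.
# "mytable", the column mapping, and the paths are illustrative placeholders.
hadoop jar hbase.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,f1:c1 \
  mytable /user/me/input

# Bulk-load mode (the option this commit documents): write HFiles to an
# output directory instead of loading directly, for a later bulk load.
hadoop jar hbase.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,f1:c1 \
  -Dimporttsv.bulk.output=/user/me/output \
  mytable /user/me/input
```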
Modified: hbase/trunk/src/site/xdoc/bulk-loads.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/site/xdoc/bulk-loads.xml?rev=1125046&r1=1125045&r2=1125046&view=diff
==============================================================================
--- hbase/trunk/src/site/xdoc/bulk-loads.xml (original)
+++ hbase/trunk/src/site/xdoc/bulk-loads.xml Thu May 19 18:37:10 2011
@@ -100,12 +100,17 @@ column name HBASE_ROW_KEY is used to des
 as the row key for each imported record. You must specify exactly one column
 to be the row key.
-In order to prepare data for a bulk data load, pass the option:
+By default importtsv will load data directly into HBase. To instead generate
+HFiles of data to prepare for a bulk data load, pass the option:
   -Dimporttsv.bulk.output=/path/for/output
 
 Other options that may be specified with -D include:
   -Dimporttsv.skip.bad.lines=false - fail if encountering an invalid line
-  -Dimporttsv.timestamp=currentTimeAsLong - use the specified timestamp for the import
+ '-Dimporttsv.separator=|' - eg separate on pipes instead of tabs
+ -Dimporttsv.timestamp=currentTimeAsLong - use the specified timestamp for the import
+ -Dimporttsv.mapper.class=my.Mapper - A user-defined Mapper to use instead of TsvImporterMapper
+
 </pre></code>
 </section>
 <section name="Importing the prepared data using the completebulkload tool">
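The completebulkload step that this section introduces can be sketched as follows. The output path and table name are illustrative placeholders, and the jar name varies by release; this is an assumed invocation shape, not text from the commit.

```shell
# Hand the HFiles produced by importtsv's -Dimporttsv.bulk.output mode
# to HBase; the path and table name are illustrative placeholders.
hadoop jar hbase.jar completebulkload /user/me/output mytable
```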