Posted to commits@sqoop.apache.org by ar...@apache.org on 2011/10/03 22:55:13 UTC
svn commit: r1178574 - in /incubator/sqoop/trunk/src/docs/user: basics.txt import-purpose.txt import.txt saved-jobs.txt
Author: arvind
Date: Mon Oct 3 20:55:13 2011
New Revision: 1178574
URL: http://svn.apache.org/viewvc?rev=1178574&view=rev
Log:
SQOOP-355. Improve Sqoop Documentation for Avro data file support.
(Doug Cutting via Arvind Prabhakar)
Modified:
incubator/sqoop/trunk/src/docs/user/basics.txt
incubator/sqoop/trunk/src/docs/user/import-purpose.txt
incubator/sqoop/trunk/src/docs/user/import.txt
incubator/sqoop/trunk/src/docs/user/saved-jobs.txt
Modified: incubator/sqoop/trunk/src/docs/user/basics.txt
URL: http://svn.apache.org/viewvc/incubator/sqoop/trunk/src/docs/user/basics.txt?rev=1178574&r1=1178573&r2=1178574&view=diff
==============================================================================
--- incubator/sqoop/trunk/src/docs/user/basics.txt (original)
+++ incubator/sqoop/trunk/src/docs/user/basics.txt Mon Oct 3 20:55:13 2011
@@ -29,7 +29,7 @@ process is a set of files containing a c
The import process is performed in parallel. For this reason, the
output will be in multiple files. These files may be delimited text
files (for example, with commas or tabs separating each field), or
-binary SequenceFiles containing serialized record data.
+binary Avro or SequenceFiles containing serialized record data.
A by-product of the import process is a generated Java class which
can encapsulate one row of the imported table. This class is used
Modified: incubator/sqoop/trunk/src/docs/user/import-purpose.txt
URL: http://svn.apache.org/viewvc/incubator/sqoop/trunk/src/docs/user/import-purpose.txt?rev=1178574&r1=1178573&r2=1178574&view=diff
==============================================================================
--- incubator/sqoop/trunk/src/docs/user/import-purpose.txt (original)
+++ incubator/sqoop/trunk/src/docs/user/import-purpose.txt Mon Oct 3 20:55:13 2011
@@ -22,5 +22,5 @@
The +import+ tool imports an individual table from an RDBMS to HDFS.
Each row from a table is represented as a separate record in HDFS.
Records can be stored as text files (one record per line), or in
-binary representation in SequenceFiles.
+binary representation as Avro or SequenceFiles.
Modified: incubator/sqoop/trunk/src/docs/user/import.txt
URL: http://svn.apache.org/viewvc/incubator/sqoop/trunk/src/docs/user/import.txt?rev=1178574&r1=1178573&r2=1178574&view=diff
==============================================================================
--- incubator/sqoop/trunk/src/docs/user/import.txt (original)
+++ incubator/sqoop/trunk/src/docs/user/import.txt Mon Oct 3 20:55:13 2011
@@ -344,11 +344,17 @@ manipulated by custom MapReduce programs
is higher-performance than reading from text files, as records do not
need to be parsed).
-By default, data is not compressed. You can compress
-your data by using the deflate (gzip) algorithm with the +-z+ or
-+\--compress+ argument, or specify any Hadoop compression codec using the
-+\--compression-codec+ argument. This applies to both SequenceFiles or text
-files.
+Avro data files are a compact, efficient binary format that provides
+interoperability with applications written in other programming
+languages. Avro also supports versioning, so that when, e.g., columns
+are added or removed from a table, previously imported data files can
+be processed along with new ones.
+
+By default, data is not compressed. You can compress your data by
+using the deflate (gzip) algorithm with the +-z+ or +\--compress+
+argument, or specify any Hadoop compression codec using the
++\--compression-codec+ argument. This applies to SequenceFile, text,
+and Avro files.
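The import and compression options documented in the hunk above can be combined on a single command line. A hedged sketch, not part of the commit: the JDBC connect string, database, table name, and codec class below are illustrative placeholders.

```shell
# Import the "employees" table as Avro data files (--as-avrodatafile),
# compressed with an explicitly chosen Hadoop codec. Using -z or
# --compress alone would apply the default deflate (gzip) algorithm.
sqoop import \
  --connect jdbc:mysql://db.example.com/corp \
  --table employees \
  --as-avrodatafile \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec
```

Swapping +--as-avrodatafile+ for +--as-sequencefile+, or omitting it to get delimited text, leaves the compression arguments unchanged, which is the point of the documentation fix: they now apply uniformly to all three file types.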
Large Objects
^^^^^^^^^^^^^
Modified: incubator/sqoop/trunk/src/docs/user/saved-jobs.txt
URL: http://svn.apache.org/viewvc/incubator/sqoop/trunk/src/docs/user/saved-jobs.txt?rev=1178574&r1=1178573&r2=1178574&view=diff
==============================================================================
--- incubator/sqoop/trunk/src/docs/user/saved-jobs.txt (original)
+++ incubator/sqoop/trunk/src/docs/user/saved-jobs.txt Mon Oct 3 20:55:13 2011
@@ -304,8 +304,8 @@ This would run a MapReduce job where the
of each row is used to join rows; rows in the +newer+ dataset will
be used in preference to rows in the +older+ dataset.
-This can be used with both SequenceFile- and text-based incremental
-imports. The file types of the newer and older datasets must be the
-same.
+This can be used with both SequenceFile-, Avro- and text-based
+incremental imports. The file types of the newer and older datasets
+must be the same.
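The merge behavior described in the final hunk can likewise be sketched as a command line. This is illustrative only; the HDFS paths, jar, and class names are placeholders, not values from the commit.

```shell
# Merge a newer incremental import onto an older dataset, joining rows
# on the "id" primary key; rows from --new-data win over rows in --onto.
# Both datasets must share one file type (text, SequenceFile, or Avro).
sqoop merge \
  --new-data /user/foo/employees_new \
  --onto /user/foo/employees_old \
  --target-dir /user/foo/employees_merged \
  --jar-file employees.jar \
  --class-name Employees \
  --merge-key id
```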