Posted to dev@sqoop.apache.org by Ankur Shanbhag <an...@persistent.co.in> on 2014/02/21 14:36:18 UTC

Support for multiple partitions on HDFS using single Sqoop import

Hello,

We can import data into a Hive partition using Sqoop's hive-import by specifying values for --hive-partition-key and --hive-partition-value. However, a single Sqoop import command creates only one partition.

Is there any way to specify multiple partition values, or a range of values, in one Sqoop job?

Sample command to import data into a partition using Sqoop:

sqoop import \
  --connect jdbc:oracle:thin:@//ps8606:1521/ORCL \
  --query "select ROLL_NO,NAME,DOB from QAUSER.STUDENT_DEMO where DOB = '1991-08-21' and \$CONDITIONS" \
  --target-dir "QAUSER.STUDENT_DEMO" \
  --hive-import --hive-overwrite \
  --hive-table "QAUSER.STUDENT" \
  --hive-partition-key "DOB" --hive-partition-value "1991-08-21" \
  --split-by ROLL_NO \
  --username QAUSER --password qauser

The above command creates a partition containing all records of students whose DOB is 21-Aug-1991. For this partition, a sub-directory named "DOB=1991-08-21" is created inside the student directory.

For different 'DOB' values, can the above Sqoop command be modified to create multiple partitions on HDFS?

Thanks and Regards,
Ankur Shanbhag

DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.


Re: Support for multiple partitions on HDFS using single Sqoop import

Posted by Venkat <ve...@gmail.com>.
Please see the Sqoop HCatalog support, which allows dynamic partitions.
It allows a DB field to be used as the partition key value (and also
allows multiple levels of partition keys).

Venkat
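
For illustration, the HCatalog suggestion above might look roughly like the
sketch below. It is not a command from this thread. It assumes a Hive table
(hypothetically named qauser.student here) that already exists and is
partitioned on a string column dob, and a Sqoop build that includes the
HCatalog integration. The --hcatalog-database and --hcatalog-table options
replace --hive-import, --hive-overwrite and --target-dir, which Sqoop does not
accept alongside HCatalog options. No partition value is given, so each row's
DOB is expected to select its partition dynamically. -P (prompt for the
password) is swapped in for --password, and --table/--columns for the
free-form query with its single-date filter. The script only prints the
command unless RUN_SQOOP=1 is set.

```shell
#!/bin/sh
# Sketch only: one Sqoop job that loads every DOB value through HCatalog.
# Assumptions (not from the thread): Hive database "qauser" and table
# "student" already exist, and the table is partitioned on a string column
# "dob". Connection details are copied from the original question.

# Build the argument list. There is no --hive-partition-key or
# --hive-partition-value: the DOB column in the imported rows is meant to
# drive the partitioning.
set -- import \
  --connect "jdbc:oracle:thin:@//ps8606:1521/ORCL" \
  --username QAUSER -P \
  --table "QAUSER.STUDENT_DEMO" \
  --columns "ROLL_NO,NAME,DOB" \
  --split-by ROLL_NO \
  --hcatalog-database qauser \
  --hcatalog-table student

# Dry run by default: show the command that would be executed.
SQOOP_CMD="sqoop $*"
echo "$SQOOP_CMD"

# Execute only when explicitly requested, e.g. RUN_SQOOP=1 sh import.sh
if [ "${RUN_SQOOP:-0}" = "1" ]; then
  sqoop "$@"
fi
```

After a real run, "SHOW PARTITIONS student;" in Hive should list one partition
per distinct DOB value, provided the assumptions above hold for your setup.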



-- 
Regards

Venkat