Posted to dev@sqoop.apache.org by "David Randolph (JIRA)" <ji...@apache.org> on 2012/05/15 23:27:07 UTC
[jira] [Comment Edited] (SQOOP-485) --hive-import cannot be used with --query
[ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276228#comment-13276228 ]
David Randolph edited comment on SQOOP-485 at 5/15/12 9:25 PM:
---------------------------------------------------------------
If Sqoop supports a direct import to Hive, I think the user should be able to specify a SQL query to define what data is imported. You can call this a missing feature instead of a bug if you prefer, but it seems like core functionality to me.
Your suggested workaround would not provide me with a functioning Hive warehouse. The complete workaround, I think, looks something like this:
# sqoop create-hive-table --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --table FACT --username HADOOP --password $pass
# sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password $pass --query 'select * from fact where day = 5 and $CONDITIONS' --fields-terminated-by , --escaped-by \\ -m1 --target-dir /user/me/fact --append
# hadoop fs -mv /user/me/fact/part-m-00000 /user/me/fact/day5
# hive -e "load data inpath '/user/me/fact/day5' into table fact"
This will move the files from the target dir to /user/hive/warehouse/fact, and the data will then be usable in Hive.
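For reference, the four steps above can be strung together as a small script. This is only a sketch: the connection string, table name, and paths are the example values from this comment, PASS is a placeholder (use -P at a prompt in practice), and each command is echoed and collected rather than executed, so the sequence can be reviewed before running it for real.

```shell
#!/bin/sh
# Dry-run sketch of the four-step workaround above.
# Values are the examples from the comment; adjust for your site.
set -eu

CONNECT='jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS'
USER='HADOOP'
PASS='changeme'        # placeholder; prefer -P so the password is prompted
DIR='/user/me/fact'

CMDS=''
run() {
    # Collect and print each command instead of executing it.
    CMDS="$CMDS$*; "
    echo "$@"
}

# 1. Create the Hive table from the Oracle table definition.
run sqoop create-hive-table --connect "$CONNECT" --table FACT \
    --username "$USER" --password "$PASS"

# 2. Import the query result into a plain HDFS directory.
run sqoop import --connect "$CONNECT" --username "$USER" --password "$PASS" \
    --query 'select * from fact where day = 5 and $CONDITIONS' \
    --fields-terminated-by , --escaped-by \\ -m 1 \
    --target-dir "$DIR" --append

# 3. Rename the part file so successive imports do not collide on load.
run hadoop fs -mv "$DIR/part-m-00000" "$DIR/day5"

# 4. Load the file into the Hive warehouse.
run hive -e "load data inpath '$DIR/day5' into table fact"
```

To actually execute the sequence, change the body of run() to `"$@"`.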
Thanks,
Dave
> --hive-import cannot be used with --query
> -----------------------------------------
>
> Key: SQOOP-485
> URL: https://issues.apache.org/jira/browse/SQOOP-485
> Project: Sqoop
> Issue Type: Bug
> Components: hive-integration
> Affects Versions: 1.3.0
> Environment: output of uname -a:
> Linux il93rhel91 2.6.18-128.2.1.el5 #1 SMP Wed Jul 8 11:54:47 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: David Randolph
>
> sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password xxx --query 'select * from fact where id < 11 and $CONDITIONS' --hive-import --hive-table fact --fields-terminated-by , --escaped-by \\ -m1 --append
> produces the following output:
> {noformat}
> sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password xxx --query 'select * from fact where id < 11 and $CONDITIONS' --hive-import --hive-table fact --fields-terminated-by , --escaped-by \\ -m1 --append
> 12/05/15 11:10:10 DEBUG tool.BaseSqoopTool: Enabled debug logging.
> 12/05/15 11:10:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> 12/05/15 11:10:10 DEBUG sqoop.Sqoop: Must specify destination with --target-dir.
> Try --help for usage instructions.
> Must specify destination with --target-dir.
> Try --help for usage instructions.
> at com.cloudera.sqoop.tool.ImportTool.validateImportOptions(ImportTool.java:823)
> at com.cloudera.sqoop.tool.ImportTool.validateOptions(ImportTool.java:890)
> at com.cloudera.sqoop.Sqoop.run(Sqoop.java:132)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
> at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
> at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
> at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> Must specify destination with --target-dir.
> Try --help for usage instructions.
> <snip>
> {noformat}
> I can work around this immediate failure by specifying the --target-dir flag. This seems to do the right thing, but the sqoop command hangs. Also, use of --target-dir with --hive-import is discouraged in SQOOP-464.
> How can I use the --query flag to import into Hive?
> :> sqoop version
> Sqoop 1.3.0-cdh3u3
> git commit id 57cbc8d38cc5ff22a24d34c3d13f9862fd1372bb
> Compiled by jenkins@ubuntu-slave02 on Thu Jan 26 10:41:26 PST 2012
> Thanks,
> Dave
--
This message is automatically generated by JIRA.