Posted to dev@sqoop.apache.org by "David Randolph (JIRA)" <ji...@apache.org> on 2012/05/15 20:09:17 UTC

[jira] [Created] (SQOOP-485) --hive-import cannot be used with --query

David Randolph created SQOOP-485:
------------------------------------

             Summary: --hive-import cannot be used with --query
                 Key: SQOOP-485
                 URL: https://issues.apache.org/jira/browse/SQOOP-485
             Project: Sqoop
          Issue Type: Bug
          Components: hive-integration
    Affects Versions: 1.3.0
         Environment: :/home/w17626> uname -a
Linux il93rhel91 2.6.18-128.2.1.el5 #1 SMP Wed Jul 8 11:54:47 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

            Reporter: David Randolph


sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password xxx --query 'select * from fact where id < 11 and $CONDITIONS' --hive-import --hive-table fact --fields-terminated-by , --escaped-by \\ -m1 --append

produces the following output:
{noformat}
sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password xxx --query 'select * from fact where id < 11 and $CONDITIONS' --hive-import --hive-table fact --fields-terminated-by , --escaped-by \\ -m1 --append
12/05/15 11:10:10 DEBUG tool.BaseSqoopTool: Enabled debug logging.
12/05/15 11:10:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/05/15 11:10:10 DEBUG sqoop.Sqoop: Must specify destination with --target-dir.
Try --help for usage instructions.
Must specify destination with --target-dir.
Try --help for usage instructions.
        at com.cloudera.sqoop.tool.ImportTool.validateImportOptions(ImportTool.java:823)
        at com.cloudera.sqoop.tool.ImportTool.validateOptions(ImportTool.java:890)
        at com.cloudera.sqoop.Sqoop.run(Sqoop.java:132)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
        at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
        at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
        at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
Must specify destination with --target-dir.
Try --help for usage instructions.
<snip>
{noformat}

I can work around this immediate failure by specifying the --target-dir flag. This seems to do the right thing, but the sqoop command hangs. Also, use of --target-dir with --hive-import is discouraged in SQOOP-464.
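For illustration, the workaround invocation can be sketched as follows; the connect string, credentials, and staging directory below are placeholders, not the real values from this report:

```shell
# Sketch of the --target-dir workaround (placeholder values).
CONNECT="jdbc:oracle:thin:@//db.example.com/TOOLS"
TARGET_DIR="/user/me/fact"

# Build the import command; --target-dir satisfies the validation
# check that --query otherwise trips. \$CONDITIONS stays literal so
# Sqoop can substitute its split predicate.
CMD="sqoop import --connect $CONNECT --username HADOOP -P \
--query 'select * from fact where id < 11 and \$CONDITIONS' \
--target-dir $TARGET_DIR -m 1"
echo "$CMD"
```

Printing rather than executing keeps the sketch environment-independent; the real command requires a reachable cluster and database.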

How can I use the --query flag to import into Hive?

:> sqoop version
Sqoop 1.3.0-cdh3u3
git commit id 57cbc8d38cc5ff22a24d34c3d13f9862fd1372bb
Compiled by jenkins@ubuntu-slave02 on Thu Jan 26 10:41:26 PST 2012


Thanks,
Dave

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Comment Edited] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276228#comment-13276228 ] 

David Randolph edited comment on SQOOP-485 at 5/15/12 9:26 PM:
---------------------------------------------------------------

If sqoop supports a direct import to Hive, I think the user should be able to specify an SQL query to define what data are imported. You can call this a missing feature instead of a bug if you prefer, but it seems like core functionality to me.

Your suggested workaround would not provide me with a functioning Hive warehouse. The complete workaround, I think, looks something like this:

# sqoop create-hive-table --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --table FACT --username HADOOP --password $pass
# sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password $pass --query 'select * from fact where day = 5 and $CONDITIONS' --fields-terminated-by , --escaped-by \\ -m1 --target-dir /user/me/fact --append
# hadoop fs -mv /user/me/fact/part-m-00000 /user/me/fact/day5
# hive -e "load data inpath '/user/me/fact/day5' into table fact"

This will move the files from the target-dir to /user/hive/warehouse/fact, and the data will be usable in Hive.
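The four steps above can be parameterized into one sketch; the connection details and paths are placeholders, and the steps are printed rather than executed since they are cluster-specific:

```shell
# Parameterized sketch of the four-step workaround above
# (placeholder connection details and paths).
CONNECT="jdbc:oracle:thin:@//db.example.com/TOOLS"
STAGE="/user/me/fact"
DAY=5

# 1. Create the Hive table metadata from the Oracle table definition.
step1="sqoop create-hive-table --connect $CONNECT --table FACT --username HADOOP -P"
# 2. Import the query results into a staging directory on HDFS.
step2="sqoop import --connect $CONNECT --username HADOOP -P \
--query 'select * from fact where day = $DAY and \$CONDITIONS' \
-m 1 --target-dir $STAGE"
# 3. Rename the part file so repeated imports don't collide.
step3="hadoop fs -mv $STAGE/part-m-00000 $STAGE/day$DAY"
# 4. Load the staged file into the Hive table.
step4="hive -e \"load data inpath '$STAGE/day$DAY' into table fact\""

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
```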

Thanks,
Dave
                


        


[jira] [Comment Edited] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279703#comment-13279703 ] 

Jarek Jarcec Cecho edited comment on SQOOP-485 at 5/20/12 9:02 AM:
-------------------------------------------------------------------

Hi Dave,
I'm using Sqoop to do Hive imports (both full and incremental) on a daily basis, so I believe this functionality is working properly :-) I was just trying to say that the usage is a little different.

I do have a couple of notes on your extended variant of my workaround.

1. This command will create the table's metadata in Hive only once, on the first execution. Any metadata changes to the table on the Oracle side (adding a column, removing a column, ...) won't be reflected in Hive on any of the subsequent calls.
2. Running Sqoop with --append won't give you any extra functionality here, as it appears you're importing into an empty directory. The --append parameter makes sense only when you're doing a sort of incremental import into a directory that already contains files.
3. -
4. As far as I know, Hive does not allow you to execute this command if the table fact already contains data. You would need to use the "OVERWRITE" keyword, but then you'll lose the previous content of the table. I believe that is not desired, right?
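For illustration, the two LOAD DATA forms under discussion can be sketched as shell strings (table and path names are placeholders). Per Hive's DML semantics, the OVERWRITE form replaces the table's existing contents, while the plain INTO form adds the new file alongside them:

```shell
# Sketch contrasting the two LOAD DATA forms (placeholder names).
APPEND_HQL="load data inpath '/user/me/fact/day5' into table fact"
# OVERWRITE deletes the table's current files before loading:
REPLACE_HQL="load data inpath '/user/me/fact/day5' overwrite into table fact"

echo "append:  hive -e \"$APPEND_HQL\""
echo "replace: hive -e \"$REPLACE_HQL\""
```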

Jarcec

                


        


[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276207#comment-13276207 ] 

Jarek Jarcec Cecho commented on SQOOP-485:
------------------------------------------

Hi David,
thank you for your feedback. If I understand you correctly, I would advise one of the following options (the higher the option, the better):

1) Upgrade to the latest version (1.4.1) and use the incremental import feature: http://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html#_incremental_imports
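As a hedged sketch of option 1 (the check column, last value, and connection details are illustrative assumptions; see the linked guide for exact semantics), an incremental append import might look like:

```shell
# Placeholder sketch of an incremental append import, as described
# in the 1.4.1 user guide. Column name and last value are assumed.
CONNECT="jdbc:oracle:thin:@//db.example.com/TOOLS"
INCR_CMD="sqoop import --connect $CONNECT --username HADOOP -P \
--table FACT --incremental append --check-column ID --last-value 10 \
--target-dir /user/me/fact"
echo "$INCR_CMD"
```

On each run, Sqoop imports only rows whose check column exceeds the recorded last value, which avoids the manual file-moving in the workaround above.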

2) Remove the --hive-import parameter and set --target-dir to point directly at the Hive warehouse directory (by default /user/hive/warehouse/$db.db/$table).
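A sketch of option 2, with hypothetical database and table names (the default warehouse layout can vary with Hive configuration):

```shell
# Sketch of option 2: drop --hive-import and aim --target-dir at the
# warehouse location directly (placeholder names throughout).
WAREHOUSE="/user/hive/warehouse"
TABLE="fact"
WH_CMD="sqoop import --connect jdbc:oracle:thin:@//db.example.com/TOOLS \
--username HADOOP -P \
--query 'select * from fact where \$CONDITIONS' \
-m 1 --target-dir $WAREHOUSE/$TABLE"
echo "$WH_CMD"
```

Files written there become visible to Hive queries against the table without a separate LOAD DATA step.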

I believe your case is not a bug at the moment. Could you please take this discussion to the user@sqoop.apache.org mailing list? That seems a better place to discuss your issue.

Jarcec
                


        

[jira] [Comment Edited] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280419#comment-13280419 ] 

David Randolph edited comment on SQOOP-485 at 5/25/12 7:00 AM:
---------------------------------------------------------------

I have confirmed that one can add rows to Hive tables that already contain rows with a command like this:
{noformat}
hive -e "load data inpath '/user/me/fact/day5' into table fact"
{noformat}
The pre-existing data are still visible in Hive.

Note that this is not really material to the issue raised here, which is to remove the apparent incompatibility between the --hive-import and --query flags. These flags should be compatible.

Thanks.
                


        

[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279735#comment-13279735 ] 

David Randolph commented on SQOOP-485:
--------------------------------------

I was just saying you should be able to define what is being imported using an SQL query. In my view, this is basic functionality for a Hive import.

Regarding your other comments:

# Yes, we only get one chance to define the Hive table. Would it be updated dynamically using the incremental feature?
# You are correct: the --append is superfluous in this context. But it does no harm, as far as I can tell.
# -
# In my tests, this command does in fact add rows to the existing Hive tables. I will be testing this further in the coming days.

Thanks,
Dave



                

[jira] [Comment Edited] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276228#comment-13276228 ] 

David Randolph edited comment on SQOOP-485 at 5/15/12 9:34 PM:
---------------------------------------------------------------

If sqoop supports a direct import to Hive, I think the user should be able to specify an SQL query to define what data are imported. You can call this a missing feature instead of a bug if you prefer, but it seems like core functionality to me.

Regarding option #1, we are using the sqoop provided with Cloudera Manager. I don't think we can upgrade until a new version is included in the Cloudera soup.

Your suggested workaround (#2) would not provide me with a functioning Hive warehouse. The complete workaround, I think, looks something like this:

# sqoop create-hive-table --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --table FACT --username HADOOP --password $pass
# sqoop import --verbose --connect jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS --username HADOOP --password $pass --query 'select * from fact where day = 5 and $CONDITIONS' --fields-terminated-by , --escaped-by \\ -m1 --target-dir /user/me/fact --append
# hadoop fs -mv /user/me/fact/part-m-00000 /user/me/fact/day5
# hive -e "load data inpath '/user/me/fact/day5' into table fact"

This will move the files from the target-dir to /user/hive/warehouse/fact, and the data will be usable in Hive.
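The four workaround steps above can be collected into a single shell sketch. The connection string, table name, and staging paths are taken from the commands in this thread; the $pass variable and /user/me/fact directory are placeholders from the original report, so adjust them for your environment. This is a sketch of the reporter's workaround, not a supported recipe:

```shell
#!/bin/sh
# Sketch of the --query-to-Hive workaround described above.
# Assumes Sqoop 1.3.x, the Oracle JDBC URL from this thread, and a
# staging directory outside the Hive warehouse.

CONNECT='jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS'
STAGING=/user/me/fact

# 1. Create the Hive table metadata once, from the Oracle table definition.
sqoop create-hive-table --connect "$CONNECT" --table FACT \
  --username HADOOP --password "$pass"

# 2. Import the query results into a plain HDFS staging directory.
sqoop import --verbose --connect "$CONNECT" \
  --username HADOOP --password "$pass" \
  --query 'select * from fact where day = 5 and $CONDITIONS' \
  --fields-terminated-by , --escaped-by \\ -m1 \
  --target-dir "$STAGING" --append

# 3. Rename the part file so repeated loads do not collide.
hadoop fs -mv "$STAGING"/part-m-00000 "$STAGING"/day5

# 4. Move the file into the Hive warehouse and register it with the table.
hive -e "load data inpath '$STAGING/day5' into table fact"
```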

Thanks,
Dave
                

[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276098#comment-13276098 ] 

Jarek Jarcec Cecho commented on SQOOP-485:
------------------------------------------

Hi Dave,
can you please describe your use case? I'm afraid that --hive-import and --append weren't designed to work together.

Jarcec
                

[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283193#comment-13283193 ] 

Jarek Jarcec Cecho commented on SQOOP-485:
------------------------------------------

Hi David,
thank you very much for your feedback. You're indeed right: Hive does support loading differently named files.

Regarding your comment about the apparent incompatibility between the --hive-import and --query arguments: they are fully compatible, and I use them together on a daily basis. The problem is that they were designed for a one-time import. You're trying to use them for a sort of incremental import, which is not supported this way, since Sqoop has explicit incremental support via the --incremental argument.
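For reference, an incremental import along those lines might look like the sketch below. The --check-column and --last-value values are hypothetical examples for the fact table from this thread; --incremental append, --check-column, and --last-value are documented Sqoop 1.x arguments, and a free-form --query still requires --target-dir:

```shell
# Hypothetical incremental variant of the import from this thread.
# Each run fetches only rows whose id exceeds the recorded --last-value.
sqoop import \
  --connect 'jdbc:oracle:thin:@//oss-devdb2.pcs.mot.com/TOOLS' \
  --username HADOOP -P \
  --query 'select * from fact where $CONDITIONS' \
  --target-dir /user/me/fact \
  --incremental append --check-column id --last-value 10 \
  -m1
```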

Another approach that might help is to make the Hive table partitioned and load your daily data into separate partitions.
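A minimal sketch of that partitioned-table approach, assuming hypothetical column names (id, val) and the day-5 staging directory used elsewhere in this thread:

```shell
# Create a Hive table with one partition per day. The column layout
# here is illustrative; match it to your actual fact table.
hive -e "CREATE TABLE fact_part (id INT, val STRING)
         PARTITIONED BY (day INT)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"

# After importing day 5 into /user/me/fact/day5 with sqoop import,
# register the file under the matching partition.
hive -e "LOAD DATA INPATH '/user/me/fact/day5'
         INTO TABLE fact_part PARTITION (day=5)"
```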

Jarcec
                

[jira] [Resolved] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho resolved SQOOP-485.
--------------------------------------

    Resolution: Invalid

This issue describes unsupported usage that has supported workarounds, so I'm closing it for now. Please do not hesitate to reopen it if needed.

Jarcec
                
[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279703#comment-13279703 ] 

Jarek Jarcec Cecho commented on SQOOP-485:
------------------------------------------

Hi Dave,
I'm using Sqoop to do Hive imports (both full and incremental) on a daily basis, so I believe that this functionality is working properly :-) I just meant that the usage is a little bit different.

I do have a couple of notes on your extended variant of my workaround.

1. This command will create the table's metadata in Hive only once, on first execution. Any metadata changes to the table on the Oracle side (adding a column, removing a column, ...) won't be reflected in Hive on subsequent calls.
2. Running sqoop with --append won't give you any extended functionality here, as it appears that you're importing into an empty directory. The --append parameter makes sense only when you're doing a sort of incremental import into a directory that already contains some files.
3. -
4. As far as I know, Hive does not allow you to execute this command if the table fact already contains any data. You would need to use the "OVERWRITE" keyword, but then you'll lose the previous content of the table. I believe that is not desired, right?


                

[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280419#comment-13280419 ] 

David Randolph commented on SQOOP-485:
--------------------------------------

I have confirmed that one can add rows to Hive tables with a command like this:
{noformat}
hive -e "load data inpath '/user/me/fact/day5' into table fact"
{noformat}
Note that this is not really material to the issue raised here. This issue was filed to remove the apparent incompatibility between the --hive-import and --query flags. These flags should be compatible.

Thanks.
                

[jira] [Commented] (SQOOP-485) --hive-import cannot be used with --query

Posted by "David Randolph (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276188#comment-13276188 ] 

David Randolph commented on SQOOP-485:
--------------------------------------

If I take --append off the command, it still fails immediately.

The use case is simply to import a set of new data to a particular Hive warehouse table from Oracle at some interval. Except for the hanging of the command, running with the --target-dir flag does exactly what is needed.
                