Posted to user@sqoop.apache.org by Marcin Cylke <mc...@touk.pl> on 2012/03/01 14:56:23 UTC

importing to hive with partitions

Hi
I'm trying to import some data from a DB to Hive tables. My Sqoop version
is 1.4.1-incubating__hadoop-1.0.0
(http://ftp.tpnet.pl/vol/d1/apache//incubator/sqoop/sqoop-1.4.1-incubating/sqoop-1.4.1-incubating__hadoop-1.0.0.tar.gz). 
I use this command:

bin/sqoop import --driver oracle.jdbc.driver.OracleDriver \
           --connect jdbc:oracle:thin:@172.16.17.232:1521:dwh \
           --username had_test \
           --password abc1234 \
           --query "select table_name, tablespace_name, cluster_name, iot_name, status, pct_free, pct_used from all_tables where owner = 'ME' and \$CONDITIONS" \
           --num-mappers 1 \
           --hive-import \
           --hive-table "sample_rm3" \
           --hive-partition-key 'OWNER' \
           --hive-partition-value 'ME' \
           --target-dir /user/hive/warehouse/sample_rm3 \
           --mysql-delimiters \
           --verbose

and am getting the following error when loading data to partitions:

12/03/01 14:44:38 INFO hive.HiveImport: Loading data to table 
default.sample_rm3 partition (owner=ME)
12/03/01 14:44:38 INFO hive.HiveImport: Failed with exception 
checkPaths: hdfs://hadoop1:9000/user/hive/warehouse/sample_rm3 has
nested directory hdfs://hadoop1:9000/user/hive/warehouse/sample_rm3/owner=ME
12/03/01 14:44:38 INFO hive.HiveImport: FAILED: Execution Error, return 
code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
12/03/01 14:44:38 ERROR tool.ImportTool: Encountered IOException running 
import job: java.io.IOException: Hive exited with status 9
         at 
org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:347)
         at 
org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:297)
         at 
org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
         at 
org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:392)
         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)


Is this a known issue? Is there some kind of workaround?
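
A quick way to see the collision the error describes (paths taken from
the log above; a diagnostic sketch using the standard HDFS CLI):

   # With --target-dir pointing inside the Hive warehouse, Sqoop's
   # staging output and the table's own directory are the same path,
   # so the partition load finds a directory nested under itself.
   hdfs dfs -ls hdfs://hadoop1:9000/user/hive/warehouse/sample_rm3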

It seems that this topic is loosely connected to this JIRA ticket:
https://issues.apache.org/jira/browse/SQOOP-312. Are there any plans
to commit such support in Sqoop?

Best regards
Marcin

Re: [ sqoopusers ] Re: importing to hive with partitions

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
You're welcome, sir.

Jarcec

On Mon, Mar 05, 2012 at 11:47:35AM +0100, Marcin Cylke wrote:
> On 05/03/12 11:13, Jarek Jarcec Cecho wrote:
> >Hi Marcin,
> >that's actually my bad. You've used a query import, so you need to specify the --target-dir option. Can you just try to call it with "--target-dir sample_rm3" (this directory will be created in your home directory on HDFS), or another path if needed?
> 
> Thanks, this solved the problem and created the table successfully.
> 
> Marcin

Re: [ sqoopusers ] Re: importing to hive with partitions

Posted by Marcin Cylke <mc...@touk.pl>.
On 05/03/12 11:13, Jarek Jarcec Cecho wrote:
> Hi Marcin,
> that's actually my bad. You've used a query import, so you need to specify the --target-dir option. Can you just try to call it with "--target-dir sample_rm3" (this directory will be created in your home directory on HDFS), or another path if needed?

Thanks, this solved the problem and created the table successfully.

Marcin

Re: importing to hive with partitions

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Marcin,
that's actually my bad. You've used a query import, so you need to specify the --target-dir option. Can you just try to call it with "--target-dir sample_rm3" (this directory will be created in your home directory on HDFS), or another path if needed?

Jarcec
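
The advice above amounts to moving --target-dir out of the Hive
warehouse. A sketch of the corrected invocation (the staging path name
"sample_rm3" is illustrative; everything else is as in the original
command):

   bin/sqoop import --driver oracle.jdbc.driver.OracleDriver \
             --connect jdbc:oracle:thin:@172.16.17.232:1521:dwh \
             --username had_test \
             --password abc1234 \
             --query "select table_name, tablespace_name, cluster_name, iot_name, status, pct_free, pct_used from all_tables where owner = 'ME' and \$CONDITIONS" \
             --num-mappers 1 \
             --hive-import \
             --hive-table "sample_rm3" \
             --hive-partition-key 'OWNER' \
             --hive-partition-value 'ME' \
             --target-dir sample_rm3 \
             --mysql-delimiters \
             --verbose

With a relative --target-dir, Sqoop first writes the extracted rows
under the user's HDFS home directory and then lets Hive move them into
the warehouse, so no nested directory is created.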

On Mon, Mar 05, 2012 at 11:01:36AM +0100, Marcin Cylke wrote:
> On 04/03/12 13:38, Jarek Jarcec Cecho wrote:
> >Hi Marcin,
> >can you try running the command without --target-dir parameter?
> 
> No, I'm unable to do this, because I'm getting an error:
> 
> Must specify destination with --target-dir.
> 
> >It seems to me that you're trying to save Sqoop output directly to the Hive warehouse directory; however, Sqoop first exports data to a different HDFS directory and then calls Hive to create the table and load (move) the data.
> 
> However reasonable this looks, the CLI options' parsing keeps me
> from doing so.
> 
> Any other advice? Or perhaps some kind of fix is necessary?
> 
> Regards
> Marcin

Re: importing to hive with partitions

Posted by Marcin Cylke <mc...@touk.pl>.
On 04/03/12 13:38, Jarek Jarcec Cecho wrote:
> Hi Marcin,
> can you try running the command without --target-dir parameter?

No, I'm unable to do this, because I'm getting an error:

Must specify destination with --target-dir.

> It seems to me that you're trying to save Sqoop output directly to the Hive warehouse directory; however, Sqoop first exports data to a different HDFS directory and then calls Hive to create the table and load (move) the data.

However reasonable this looks, the CLI options' parsing keeps me from 
doing so.

Any other advice? Or perhaps some kind of fix is necessary?

Regards
Marcin

Re: importing to hive with partitions

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Marcin,
can you try running the command without --target-dir parameter? 

It seems to me that you're trying to save Sqoop output directly to the Hive warehouse directory; however, Sqoop first exports data to a different HDFS directory and then calls Hive to create the table and load (move) the data.

Jarcec
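
The two-phase behaviour described above corresponds roughly to the Hive
script Sqoop generates and runs (illustrative HiveQL, not the exact
generated statements; column list abbreviated and the staging path is a
placeholder):

   CREATE TABLE IF NOT EXISTS sample_rm3 (table_name STRING, ...)
     PARTITIONED BY (owner STRING);
   -- LOAD DATA INPATH *moves* the staged files into the warehouse;
   -- this is the MoveTask that fails in the log above.
   LOAD DATA INPATH 'hdfs://hadoop1:9000/user/<home>/sample_rm3'
     INTO TABLE sample_rm3 PARTITION (owner='ME');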

On Thu, Mar 01, 2012 at 02:56:23PM +0100, Marcin Cylke wrote:
> Hi
> I'm trying to import some data from a DB to Hive tables. My Sqoop
> version is 1.4.1-incubating__hadoop-1.0.0 (http://ftp.tpnet.pl/vol/d1/apache//incubator/sqoop/sqoop-1.4.1-incubating/sqoop-1.4.1-incubating__hadoop-1.0.0.tar.gz).
> I use this command:
> 
> bin/sqoop import --driver oracle.jdbc.driver.OracleDriver \
>           --connect jdbc:oracle:thin:@172.16.17.232:1521:dwh \
>           --username had_test \
>           --password abc1234 \
>           --query "select table_name, tablespace_name, cluster_name, iot_name, status, pct_free, pct_used from all_tables where owner = 'ME' and \$CONDITIONS" \
>           --num-mappers 1 \
>           --hive-import \
>           --hive-table "sample_rm3" \
>           --hive-partition-key 'OWNER' \
>           --hive-partition-value 'ME' \
>           --target-dir /user/hive/warehouse/sample_rm3 \
>           --mysql-delimiters \
>           --verbose
> 
> and am getting the following error when loading data to partitions:
> 
> 12/03/01 14:44:38 INFO hive.HiveImport: Loading data to table
> default.sample_rm3 partition (owner=ME)
> 12/03/01 14:44:38 INFO hive.HiveImport: Failed with exception
> checkPaths: hdfs://hadoop1:9000/user/hive/warehouse/sample_rm3 has
> nested directory hdfs://hadoop1:9000/user/hive/warehouse/sample_rm3/owner=ME
> 12/03/01 14:44:38 INFO hive.HiveImport: FAILED: Execution Error,
> return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> 12/03/01 14:44:38 ERROR tool.ImportTool: Encountered IOException
> running import job: java.io.IOException: Hive exited with status 9
>         at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:347)
>         at
> org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:297)
>         at
> org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
>         at
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:392)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
> 
> 
> Is this a known issue? Is there some kind of workaround?
> 
> It seems that this topic is loosely connected to this JIRA ticket:
> https://issues.apache.org/jira/browse/SQOOP-312. Are there any plans
> to commit such support in Sqoop?
> 
> Best regards
> Marcin