Posted to user@sqoop.apache.org by Thomas Lété <th...@soprism.com> on 2015/01/15 09:41:59 UTC

Sqoop Parquet incremental import “Cannot append files to target dir”

Hi everyone!
I have a problem importing data from MySQL into Hive as Parquet using Sqoop. (I had no problem with text files.)

This command:

	sqoop import --connect jdbc:mysql://xx.xx.xx.xx/database --username sqoop --password sqoop --table datatable --target-dir /home/cloudera/user/hive/warehouse/database.db/datatable --as-parquetfile -m 1 --append
returns this error:

	15/01/14 16:27:28 WARN util.AppendUtils: Cannot append files to target dir; no such directory: _sqoop/14162350000000781_32315_servername.ip-xx-xx-xx.eu_datatable
The files are actually written under /user/root/_sqoop/ like this: /user/root/_sqoop/14162350000000781_32315_servername/ip-xx-xx-xx/eu_datatable/

Is it normal that the dots from the hostname are replaced by slashes? That seems to be the problem, but no one else is complaining about it...

Or is there another way to append data to a Parquet file?

PS: I’m using CDH 5.3, which has Sqoop 1.4.5 built in.
Thank you!
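
The mismatch described in this message can be illustrated with plain string handling. This is only a sketch of the symptom, not Sqoop or Kite code: the helper dots_to_slashes is hypothetical, and it assumes each dot in the temporary directory name is replaced one-for-one by a path separator, as the two paths above suggest.

```python
from pathlib import PurePosixPath

# Temporary directory name copied from the WARN line in the log above.
expected_name = "14162350000000781_32315_servername.ip-xx-xx-xx.eu_datatable"


def dots_to_slashes(name: str) -> str:
    """Hypothetical helper: mimic the observed dot-to-slash replacement."""
    return name.replace(".", "/")


base = PurePosixPath("/user/root/_sqoop")

# The single directory the append step looks for.
expected = base / expected_name

# The nested directories that actually appear on disk (assumed mapping).
actual = base / dots_to_slashes(expected_name)

print("looked for:", expected)
print("written to:", actual)
print("same path: ", expected == actual)
```

Under that assumption, the one directory name becomes three nested directories, so a lookup of the flat name finds nothing, which is consistent with the "no such directory" warning.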

RE: Sqoop Parquet incremental import “Cannot append files to target dir”

Posted by "Xu, Qian A" <qi...@intel.com>.
Hi Thomas,

I've attached a patch at https://issues.apache.org/jira/browse/SQOOP-2149. It would be great if you could confirm that it works for you.

 --Stanley (Qian) Xu



RE: Sqoop Parquet incremental import “Cannot append files to target dir”

Posted by "Xu, Qian A" <qi...@intel.com>.
Hi Thomas,

Don’t worry. Please create a JIRA issue at https://issues.apache.org/jira/browse/SQOOP and attach the verbose log and any other details.

Qian




Re: Sqoop Parquet incremental import “Cannot append files to target dir”

Posted by Thomas Lété <th...@soprism.com>.
Hi Qian,

Thank you for your reply. Unfortunately, the table is named datatable (--table datatable), so the table name itself contains no dots.

The file Sqoop attempts to create is simply named after the server's hostname, which seems to be the problem.

I can send you a complete verbose log if that would help.

Thomas



RE: Sqoop Parquet incremental import “Cannot append files to target dir”

Posted by "Xu, Qian A" <qi...@intel.com>.
Hi Thomas,

Sqoop's Parquet support uses the Kite SDK, and Kite handles Hive access differently than Sqoop does.

Hive table names cannot contain a dot, so any dot is replaced with a slash. This was known behavior in Kite 0.16: https://issues.cloudera.org/browse/CDK-650

I’m not sure whether Sqoop 1.4.5 already uses Kite 0.17. Please file a JIRA issue and I will look into it.

--Stanley (Qian) Xu

