Posted to user@sqoop.apache.org by Rajkumar <ra...@yahoo.co.in> on 2016/03/17 05:25:45 UTC
Sqoop export to SQL Server fails/hangs with a larger number of columns
Hi folks,
I am trying to export data from HDFS to SQL Server. The original table has over 500 columns, and every time I execute the Sqoop export job it gets stuck showing map-reduce completed at 100%. I created two dummy tables, as shown below, to find out where exactly the problem lies. The only difference between table1 and table2 is that the latter has one additional column [col14 varchar(5)].

First, I ran the export job for Table1, which has 13 columns [datatype varchar(5)]. The job completed successfully and exported all 3 records to SQL Server.

Next, I executed the export job for Table2 with 14 columns. When I ran this job I didn't see any error messages or exceptions, but it hangs forever after map completes at 100%. The SQL Server Activity Monitor shows a process being created, however it's not receiving any data/prepared statement from Hadoop.

Does this problem exist only with SQL Server? Is there any limitation on the number of columns exported to SQL Server? Please advise.

Configuration
Hadoop Version – Cloudera 2.6.0-CDH-5.5.2
Sqoop Version – 1.4.6
SQL Server Version – 2008 R2

Table 1

CREATE TABLE [dbo].[tbldummy1](
[col1] [varchar](5) NOT NULL,
[col2] [varchar](5) NULL,
[col3] [varchar](5) NULL,
[col4] [varchar](5) NULL,
[col5] [varchar](5) NULL,
[col6] [varchar](5) NULL,
[col7] [varchar](5) NULL,
[col8] [varchar](5) NULL,
[col9] [varchar](5) NULL,
[col10] [varchar](5) NULL,
[col11] [varchar](5) NULL,
[col12] [varchar](5) NULL,
[col13] [varchar](5) NULL,
CONSTRAINT [PK_dummy1] PRIMARY KEY ([col1] ASC))

Sqoop Command for Table 1

sqoop export \
--connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
--username xxxxxx --password yyyyyy \
--table tbldummy1 \
--export-dir /user/hue/Out1 \
--input-fields-terminated-by '|' \
-m 1 \
--verbose

Input data for Out1

aa|01|02|03|04|05|06|07|08|09|10|11|12
bb|01|02|03|04|05|06|07|08|09|10|11|12
cc|01|02|03|04|05|06|07|08|09|10|11|12

Table 2

CREATE TABLE [dbo].[tbldummy2](
[col1] [varchar](5) NOT NULL,
[col2] [varchar](5) NULL,
[col3] [varchar](5) NULL,
[col4] [varchar](5) NULL,
[col5] [varchar](5) NULL,
[col6] [varchar](5) NULL,
[col7] [varchar](5) NULL,
[col8] [varchar](5) NULL,
[col9] [varchar](5) NULL,
[col10] [varchar](5) NULL,
[col11] [varchar](5) NULL,
[col12] [varchar](5) NULL,
[col13] [varchar](5) NULL,
[col14] [varchar](5) NULL,
CONSTRAINT [PK_dummy2] PRIMARY KEY ([col1] ASC))

Sqoop Command for Table 2

sqoop export \
--connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
--username xxxxxx --password yyyyyy \
--table tbldummy2 \
--export-dir /user/hue/Out2 \
--input-fields-terminated-by '|' \
-m 1 \
--verbose

Input data for Table 2

aa|01|02|03|04|05|06|07|08|09|10|11|12|13
bb|01|02|03|04|05|06|07|08|09|10|11|12|13
cc|01|02|03|04|05|06|07|08|09|10|11|12|13

Console logs for Table 2

16/03/16 23:35:01 INFO mapreduce.Job: Running job: job_1458150283440_0028
16/03/16 23:35:07 INFO mapreduce.Job: Job job_1458150283440_0028 running in uber mode : false
16/03/16 23:35:07 INFO mapreduce.Job: map 0% reduce 0%
16/03/16 23:35:18 INFO mapreduce.Job: map 100% reduce 0%

Thanks!
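(A quick client-side check that can rule out malformed input before suspecting SQL Server itself: confirm that every line in the export directory splits into exactly as many fields as the target table has columns, and that no field exceeds the declared varchar(5) width. This is a minimal sketch, not part of the poster's setup; the column count and width are the tbldummy2 values from the post.)

```python
# Sanity-check pipe-delimited export input against the target table's
# shape. Assumes 14 columns of varchar(5), matching tbldummy2 above.
NUM_COLS = 14
MAX_WIDTH = 5

def check_line(line, num_cols=NUM_COLS, max_width=MAX_WIDTH):
    """Return a list of problems found in one delimited record."""
    fields = line.rstrip("\n").split("|")
    problems = []
    if len(fields) != num_cols:
        problems.append(f"expected {num_cols} fields, got {len(fields)}")
    for i, value in enumerate(fields, start=1):
        if len(value) > max_width:
            problems.append(f"col{i} value {value!r} exceeds varchar({max_width})")
    return problems

# Sample rows taken from the post's input data for Table 2.
sample = [
    "aa|01|02|03|04|05|06|07|08|09|10|11|12|13",
    "bb|01|02|03|04|05|06|07|08|09|10|11|12|13",
    "cc|01|02|03|04|05|06|07|08|09|10|11|12|13",
]
for line in sample:
    print(check_line(line) or "ok")
```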
Re: Sqoop export to SQL Server fails/hangs with a larger number of columns
Posted by Ali Thawban <or...@gmail.com>.
What do the hadoop job/attempt logs say?
Kill the job by running
mapred job -kill <job_id>
Get the job *attempt* logs by running
mapred job -logs <job_id> <attempt_id>
Whenever Sqoop fails because of SQL errors - e.g. if a decimal value being
inserted is too big for the column or a varchar field would be truncated,
it'll appear to hang like this, and you'll find the actual SQL error in the
task attempt logs. IMO this is a huge flaw in Sqoop's export feature and I
hate to use Sqoop for exports because of it. My time-sensitive Hadoop job
queues get jammed by Sqoop exports choking on one bad value in one row.
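One more thing that may be worth checking for the real 500-column table: SQL Server caps a single request at 2100 bind parameters, and Sqoop's generic JDBC export batches multiple rows into one prepared INSERT, so a very wide table can blow past that cap even though a narrow dummy table does not. Lowering the rows-per-statement batch size is a commonly suggested workaround; the property names below are from the Sqoop 1.4 user guide, and the values are purely illustrative (500 columns x 4 rows = 2000 parameters, under the cap):

```shell
# Illustrative only: shrink the multi-row INSERT batch so a wide table
# stays under SQL Server's 2100-parameter-per-request limit.
sqoop export \
  -Dsqoop.export.records.per.statement=4 \
  -Dsqoop.export.statements.per.transaction=10 \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --table tbldummy2 \
  --export-dir /user/hue/Out2 \
  --input-fields-terminated-by '|' \
  -m 1
```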
Unless you really need a clustered export to SQL for this, i.e. you're
exporting so many rows that you're exhausting a single *client* machine's
CPU/RAM, you might want to avoid the export feature. I've found it too
immature to be useful, at least in Sqoop1.
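For the small-data case, the manual alternative is a batched, parameterized INSERT from a single client. A minimal sketch of the statement and batch construction (table and column names are borrowed from the post's dummy tables; the live connection, e.g. via a pyodbc cursor, is deliberately left out):

```python
from itertools import islice

def insert_sql(table, num_cols):
    """Build a parameterized INSERT for num_cols generically named columns."""
    cols = ", ".join(f"col{i}" for i in range(1, num_cols + 1))
    marks = ", ".join(["?"] * num_cols)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})"

def batches(rows, size):
    """Yield successive chunks of at most `size` rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

# Rows parsed from the post's pipe-delimited input for Table 2.
rows = [line.split("|") for line in (
    "aa|01|02|03|04|05|06|07|08|09|10|11|12|13",
    "bb|01|02|03|04|05|06|07|08|09|10|11|12|13",
    "cc|01|02|03|04|05|06|07|08|09|10|11|12|13",
)]
sql = insert_sql("dbo.tbldummy2", 14)
# With a live connection you would then run, per batch:
#   cursor.executemany(sql, batch)
for batch in batches(rows, 2):
    print(len(batch), "rows in batch")
```

Keeping batches small here serves the same purpose as the Sqoop batching knobs: each executemany round trip stays well under the server's per-request parameter limit, and a bad value fails fast with a visible SQL error instead of a silent hang.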
Good luck.
On Thu, Mar 17, 2016 at 12:25 AM, Rajkumar <ra...@yahoo.co.in> wrote: