Posted to user@sqoop.apache.org by David Langer <da...@hotmail.com> on 2012/01/20 20:51:21 UTC

The --hive-overwrite doesn't overwrite data

Greetings!
 
Hopefully this isn't too much of a newbie question, but I am unable to get the --hive-overwrite argument working. I'm using sqoop 1.3.0-cdh3u2 on the Cloudera VMWare Player VM.
 
 
The following sqoop invocation succeeds in creating the Hive table and populates it with data:
 
sqoop import --connect 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera --query 'SELECT *, 47 AS JobID FROM SalesPerson WHERE $CONDITIONS' --split-by ID  --target-dir /tmp/SalesPerson --create-hive-table --hive-import --hive-table MyDB_SalesPerson
 
 
However, while the following sqoop invocation does produce the desired data in HDFS (i.e., /tmp/SalesPerson), it does not overwrite the data in the Hive table:
 
sqoop import --connect 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS' --split-by ID  --target-dir /tmp/SalesPerson --hive-overwrite --hive-table MyDB_salesperson
 
 
There is nothing in Hive.log that indicates the --hive-overwrite sqoop invocation is interacting with Hive (e.g., no exceptions).
 
Any assistance would be greatly appreciated.
 
Thanx,

Dave

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Weihua Zhu <wz...@adconion.com>.
Hi Kathleen,

  Thanks very much for the help...
  Regards,

  -Weihua


On Jan 23, 2012, at 5:26 PM, Kathleen Ting wrote:

Hi Weihua -

Unfortunately, the generic jdbc manager does not support staging.
As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431 on your behalf.

Regards, Kathleen


On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com>> wrote:
Hi Guys,

  Good afternoon!
  I have a question. I was trying to sqoop exporting from hdfs to postgresql, using --staging-table options due to transactions consideration. But it gives me error below.
  I am wondering if the staging_able is supported for GenericJdbcManager? if not, what kind of manager should I use?
  Thanks very much!

 -Weihua

error message:

12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not support staging of data for export. Please retry without specifying the --staging-table option.





Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by abhijeet gaikwad <ab...@gmail.com>.
As to what I understand:
It is loading 1000 records because there are 1000 records in your input
file. We cannot relate this to batch size because firing a select count(*)
query will give you the total records inserted in the table - not the
batches in which they were inserted.

Cannot comment on the last statement (#3) in your mail below - especially in
the context of Teradata. Open for discussion :)

Thanks,
Abhijeet Gaikwad

On Sat, Jan 28, 2012 at 11:28 PM, Srinivas Surasani <va...@gmail.com>wrote:

>
> Abhijeet --
>
> 1) By default it is loading 1000 records and as you mentioned earlier this
> can be tweaked using  "sqoop.export.records.per.statement ".
> 2) I  just run select count(*) on teradata table and seen 1000 records
> inserted at one go.
> 3) I believe  setting number of mappers > 1 works for only non-parallel
> databases.
>
> -- Srinivas
>
>
> On Sat, Jan 28, 2012 at 12:47 PM, abhijeet gaikwad <abygaikwad17@gmail.com
> > wrote:
>
>> Hi Srinivas,
>> Export with multiple mappers is allowed in SQOOP. I have exported data
>> into Sql Server as well as MySql using multiple mappers.
>>
>> Regarding the issue you are facing, I have few questions:
>> 1. How did you set the batch size, 1000 you are talking about.
>> 2. Can you share SQOOP logs in detail?
>> 3. Deadlock issue seems to have raised by Teradata. There is an equal
>> probability that a Teradata admin will be able to resolve this issue.
>>
>> Thanks,
>> Abhijeet Gaikwad
>> On 28 Jan 2012 22:16, "Srinivas Surasani" <va...@gmail.com> wrote:
>>
>>> Hi Abhijeet --,
>>>
>>>  Thanks for the information. I have one more question. Is the exports is
>>> done always with one mapper? ( entering into table deadlocks if number of
>>> mappers set to more than one ).
>>> Also, FYI: I have observed the default number of rows inserted is 1000.
>>>
>>> Thanks,
>>> -- Srinvas
>>>
>>> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <
>>> abygaikwad17@gmail.com> wrote:
>>>
>>>> Hi Srinivas,
>>>> Haven't played with Teradata Connector, but in general there are two
>>>> properties that drive insertions (SQOOP Export) in a table - namely:
>>>>
>>>> 1. "sqoop.export.records.per.statement" : This property is used to
>>>> specify the number of records/rows to be inserted using a single
>>>> INSERT statement. Default value is 100.
>>>> 2. "sqoop.export.statements.per.transaction" : This property is used
>>>> to specify the number the insert statements before a commit is fired -
>>>> which you can call batch size. Default value is 100.
>>>>
>>>> You can use -D hadoop argument to specify these properties at command
>>>> line. E.g. -Dsqoop.export.statements.per.transaction=50
>>>>
>>>> NOTE: Make sure you use this argument(-D) before using any of the
>>>> SQOOP tool specific arguments. See SQOOP User Guide for more details.
>>>>
>>>> Thanks,
>>>> Abhijeet Gaikwad
>>>>
>>>> On 1/26/12, Srinivas Surasani <va...@gmail.com> wrote:
>>>> > Kathleen,
>>>> >
>>>> > Any information on below request.
>>>> >
>>>> > Hi All,
>>>> >
>>>> > I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
>>>> >
>>>> > I'm trying to export csv files from HDFS to Teradata, it works well
>>>> with
>>>> > setting mapper to "1" ( with batch loading of 1000 records at a time
>>>> ).
>>>> > when I tried increasing the number of mappers to more than one I'm
>>>> getting
>>>> > the following error. Also, is it possible to configure batch size at
>>>> the
>>>> > time of export ( from the command line)??
>>>> >
>>>> >
>>>> >  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
>>>> > --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx
>>>> --password
>>>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
>>>> > --fields-terminated-by '|' -m 4 --batch
>>>> >
>>>> > 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
>>>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
>>>> > attempt_201112211106_68553_m_000001_2, Status : FAILED
>>>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata
>>>> Database]
>>>> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction
>>>> ABORTed
>>>> > due to DeadLock*.
>>>> >
>>>> > Srinivas --
>>>> >
>>>> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <kathleen@cloudera.com
>>>> >wrote:
>>>> >
>>>> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>>>> >> staging tables. It is freely available here:
>>>> >>
>>>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
>>>> >> .
>>>> >>
>>>> >> Regards, Kathleen
>>>> >>
>>>> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>>>> >> <va...@gmail.com>wrote:
>>>> >>
>>>> >>> Hi Kathleen,
>>>> >>>
>>>> >>> Same issue with Teradata.
>>>> >>>
>>>> >>>
>>>> >>> Srinivas --
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>>>> >>> <ka...@cloudera.com>wrote:
>>>> >>>
>>>> >>>> Hi Weihua -
>>>> >>>>
>>>> >>>> Unfortunately, the generic jdbc manager does not support staging.
>>>> >>>> As a result, I've filed
>>>> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>>> >>>>
>>>> >>>> Regards, Kathleen
>>>> >>>>
>>>> >>>>
>>>> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com>
>>>> wrote:
>>>> >>>>
>>>> >>>>> Hi Guys,
>>>> >>>>>
>>>> >>>>>   Good afternoon!
>>>> >>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>>> >>>>> postgresql, using --staging-table options due to transactions
>>>> >>>>> consideration. But it gives me error below.
>>>> >>>>>   I am wondering if the staging_able is supported for
>>>> >>>>> GenericJdbcManager? if not, what kind of manager should I use?
>>>> >>>>>   Thanks very much!
>>>> >>>>>
>>>> >>>>>  -Weihua
>>>> >>>>>
>>>> >>>>> error message:
>>>> >>>>>
>>>> >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>>> >>>>> active connection manager
>>>> (org.apache.sqoop.manager.GenericJdbcManager)
>>>> >>>>> does not support staging of data for export. Please retry without
>>>> >>>>> specifying the --staging-table option.
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Srinivas Surasani <va...@gmail.com>.
Abhijeet --

1) By default it is loading 1000 records, and as you mentioned earlier this
can be tweaked using "sqoop.export.records.per.statement".
2) I just ran select count(*) on the Teradata table and saw 1000 records
inserted in one go.
3) I believe setting the number of mappers > 1 works only for non-parallel
databases.

-- Srinivas

On Sat, Jan 28, 2012 at 12:47 PM, abhijeet gaikwad
<ab...@gmail.com>wrote:

> Hi Srinivas,
> Export with multiple mappers is allowed in SQOOP. I have exported data
> into Sql Server as well as MySql using multiple mappers.
>
> Regarding the issue you are facing, I have few questions:
> 1. How did you set the batch size, 1000 you are talking about.
> 2. Can you share SQOOP logs in detail?
> 3. Deadlock issue seems to have raised by Teradata. There is an equal
> probability that a Teradata admin will be able to resolve this issue.
>
> Thanks,
> Abhijeet Gaikwad
> On 28 Jan 2012 22:16, "Srinivas Surasani" <va...@gmail.com> wrote:
>
>> Hi Abhijeet --,
>>
>>  Thanks for the information. I have one more question. Is the exports is
>> done always with one mapper? ( entering into table deadlocks if number of
>> mappers set to more than one ).
>> Also, FYI: I have observed the default number of rows inserted is 1000.
>>
>> Thanks,
>> -- Srinvas
>>
>> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <
>> abygaikwad17@gmail.com> wrote:
>>
>>> Hi Srinivas,
>>> Haven't played with Teradata Connector, but in general there are two
>>> properties that drive insertions (SQOOP Export) in a table - namely:
>>>
>>> 1. "sqoop.export.records.per.statement" : This property is used to
>>> specify the number of records/rows to be inserted using a single
>>> INSERT statement. Default value is 100.
>>> 2. "sqoop.export.statements.per.transaction" : This property is used
>>> to specify the number the insert statements before a commit is fired -
>>> which you can call batch size. Default value is 100.
>>>
>>> You can use -D hadoop argument to specify these properties at command
>>> line. E.g. -Dsqoop.export.statements.per.transaction=50
>>>
>>> NOTE: Make sure you use this argument(-D) before using any of the
>>> SQOOP tool specific arguments. See SQOOP User Guide for more details.
>>>
>>> Thanks,
>>> Abhijeet Gaikwad
>>>
>>> On 1/26/12, Srinivas Surasani <va...@gmail.com> wrote:
>>> > Kathleen,
>>> >
>>> > Any information on below request.
>>> >
>>> > Hi All,
>>> >
>>> > I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
>>> >
>>> > I'm trying to export csv files from HDFS to Teradata, it works well
>>> with
>>> > setting mapper to "1" ( with batch loading of 1000 records at a time ).
>>> > when I tried increasing the number of mappers to more than one I'm
>>> getting
>>> > the following error. Also, is it possible to configure batch size at
>>> the
>>> > time of export ( from the command line)??
>>> >
>>> >
>>> >  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
>>> > --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx
>>> --password
>>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
>>> > --fields-terminated-by '|' -m 4 --batch
>>> >
>>> > 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
>>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
>>> > attempt_201112211106_68553_m_000001_2, Status : FAILED
>>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata
>>> Database]
>>> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction
>>> ABORTed
>>> > due to DeadLock*.
>>> >
>>> > Srinivas --
>>> >
>>> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <kathleen@cloudera.com
>>> >wrote:
>>> >
>>> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>>> >> staging tables. It is freely available here:
>>> >>
>>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
>>> >> .
>>> >>
>>> >> Regards, Kathleen
>>> >>
>>> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>>> >> <va...@gmail.com>wrote:
>>> >>
>>> >>> Hi Kathleen,
>>> >>>
>>> >>> Same issue with Teradata.
>>> >>>
>>> >>>
>>> >>> Srinivas --
>>> >>>
>>> >>>
>>> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>>> >>> <ka...@cloudera.com>wrote:
>>> >>>
>>> >>>> Hi Weihua -
>>> >>>>
>>> >>>> Unfortunately, the generic jdbc manager does not support staging.
>>> >>>> As a result, I've filed
>>> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>> >>>>
>>> >>>> Regards, Kathleen
>>> >>>>
>>> >>>>
>>> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com>
>>> wrote:
>>> >>>>
>>> >>>>> Hi Guys,
>>> >>>>>
>>> >>>>>   Good afternoon!
>>> >>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>> >>>>> postgresql, using --staging-table options due to transactions
>>> >>>>> consideration. But it gives me error below.
>>> >>>>>   I am wondering if the staging_able is supported for
>>> >>>>> GenericJdbcManager? if not, what kind of manager should I use?
>>> >>>>>   Thanks very much!
>>> >>>>>
>>> >>>>>  -Weihua
>>> >>>>>
>>> >>>>> error message:
>>> >>>>>
>>> >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>> >>>>> active connection manager
>>> (org.apache.sqoop.manager.GenericJdbcManager)
>>> >>>>> does not support staging of data for export. Please retry without
>>> >>>>> specifying the --staging-table option.
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>
>>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by abhijeet gaikwad <ab...@gmail.com>.
Hi Srinivas,
Export with multiple mappers is allowed in Sqoop. I have exported data into
SQL Server as well as MySQL using multiple mappers.

Regarding the issue you are facing, I have a few questions:
1. How did you set the batch size of 1000 that you are talking about?
2. Can you share the Sqoop logs in detail?
3. The deadlock issue seems to have been raised by Teradata. It is just as
likely that a Teradata admin will be able to resolve this issue.

Thanks,
Abhijeet Gaikwad
On 28 Jan 2012 22:16, "Srinivas Surasani" <va...@gmail.com> wrote:

> Hi Abhijeet --,
>
>  Thanks for the information. I have one more question. Is the exports is
> done always with one mapper? ( entering into table deadlocks if number of
> mappers set to more than one ).
> Also, FYI: I have observed the default number of rows inserted is 1000.
>
> Thanks,
> -- Srinvas
>
> On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad <abygaikwad17@gmail.com
> > wrote:
>
>> Hi Srinivas,
>> Haven't played with Teradata Connector, but in general there are two
>> properties that drive insertions (SQOOP Export) in a table - namely:
>>
>> 1. "sqoop.export.records.per.statement" : This property is used to
>> specify the number of records/rows to be inserted using a single
>> INSERT statement. Default value is 100.
>> 2. "sqoop.export.statements.per.transaction" : This property is used
>> to specify the number the insert statements before a commit is fired -
>> which you can call batch size. Default value is 100.
>>
>> You can use -D hadoop argument to specify these properties at command
>> line. E.g. -Dsqoop.export.statements.per.transaction=50
>>
>> NOTE: Make sure you use this argument(-D) before using any of the
>> SQOOP tool specific arguments. See SQOOP User Guide for more details.
>>
>> Thanks,
>> Abhijeet Gaikwad
>>
>> On 1/26/12, Srinivas Surasani <va...@gmail.com> wrote:
>> > Kathleen,
>> >
>> > Any information on below request.
>> >
>> > Hi All,
>> >
>> > I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
>> >
>> > I'm trying to export csv files from HDFS to Teradata, it works well with
>> > setting mapper to "1" ( with batch loading of 1000 records at a time ).
>> > when I tried increasing the number of mappers to more than one I'm
>> getting
>> > the following error. Also, is it possible to configure batch size at the
>> > time of export ( from the command line)??
>> >
>> >
>> >  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
>> > --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx
>> --password
>> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
>> > --fields-terminated-by '|' -m 4 --batch
>> >
>> > 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
>> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
>> > attempt_201112211106_68553_m_000001_2, Status : FAILED
>> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
>> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
>> > due to DeadLock*.
>> >
>> > Srinivas --
>> >
>> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <kathleen@cloudera.com
>> >wrote:
>> >
>> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>> >> staging tables. It is freely available here:
>> >>
>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
>> >> .
>> >>
>> >> Regards, Kathleen
>> >>
>> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>> >> <va...@gmail.com>wrote:
>> >>
>> >>> Hi Kathleen,
>> >>>
>> >>> Same issue with Teradata.
>> >>>
>> >>>
>> >>> Srinivas --
>> >>>
>> >>>
>> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>> >>> <ka...@cloudera.com>wrote:
>> >>>
>> >>>> Hi Weihua -
>> >>>>
>> >>>> Unfortunately, the generic jdbc manager does not support staging.
>> >>>> As a result, I've filed
>> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>> >>>>
>> >>>> Regards, Kathleen
>> >>>>
>> >>>>
>> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com>
>> wrote:
>> >>>>
>> >>>>> Hi Guys,
>> >>>>>
>> >>>>>   Good afternoon!
>> >>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>> >>>>> postgresql, using --staging-table options due to transactions
>> >>>>> consideration. But it gives me error below.
>> >>>>>   I am wondering if the staging_able is supported for
>> >>>>> GenericJdbcManager? if not, what kind of manager should I use?
>> >>>>>   Thanks very much!
>> >>>>>
>> >>>>>  -Weihua
>> >>>>>
>> >>>>> error message:
>> >>>>>
>> >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>> >>>>> active connection manager
>> (org.apache.sqoop.manager.GenericJdbcManager)
>> >>>>> does not support staging of data for export. Please retry without
>> >>>>> specifying the --staging-table option.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Srinivas Surasani <va...@gmail.com>.
Hi Abhijeet --,

 Thanks for the information. I have one more question: is the export always
done with one mapper? (The table runs into deadlocks if the number of
mappers is set to more than one.)
Also, FYI: I have observed that the default number of rows inserted is 1000.

Thanks,
-- Srinivas

On Sat, Jan 28, 2012 at 10:12 AM, abhijeet gaikwad
<ab...@gmail.com>wrote:

> Hi Srinivas,
> Haven't played with Teradata Connector, but in general there are two
> properties that drive insertions (SQOOP Export) in a table - namely:
>
> 1. "sqoop.export.records.per.statement" : This property is used to
> specify the number of records/rows to be inserted using a single
> INSERT statement. Default value is 100.
> 2. "sqoop.export.statements.per.transaction" : This property is used
> to specify the number the insert statements before a commit is fired -
> which you can call batch size. Default value is 100.
>
> You can use -D hadoop argument to specify these properties at command
> line. E.g. -Dsqoop.export.statements.per.transaction=50
>
> NOTE: Make sure you use this argument(-D) before using any of the
> SQOOP tool specific arguments. See SQOOP User Guide for more details.
>
> Thanks,
> Abhijeet Gaikwad
>
> On 1/26/12, Srinivas Surasani <va...@gmail.com> wrote:
> > Kathleen,
> >
> > Any information on below request.
> >
> > Hi All,
> >
> > I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
> >
> > I'm trying to export csv files from HDFS to Teradata, it works well with
> > setting mapper to "1" ( with batch loading of 1000 records at a time ).
> > when I tried increasing the number of mappers to more than one I'm
> getting
> > the following error. Also, is it possible to configure batch size at the
> > time of export ( from the command line)??
> >
> >
> >  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
> > --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx --password
> > xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
> > --fields-terminated-by '|' -m 4 --batch
> >
> > 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
> > 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
> > attempt_201112211106_68553_m_000001_2, Status : FAILED
> > *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
> > [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
> > due to DeadLock*.
> >
> > Srinivas --
> >
> > On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <kathleen@cloudera.com
> >wrote:
> >
> >> Srinivas, as it happens, the Cloudera Connector for Teradata supports
> >> staging tables. It is freely available here:
> >>
> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
> >> .
> >>
> >> Regards, Kathleen
> >>
> >> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
> >> <va...@gmail.com>wrote:
> >>
> >>> Hi Kathleen,
> >>>
> >>> Same issue with Teradata.
> >>>
> >>>
> >>> Srinivas --
> >>>
> >>>
> >>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
> >>> <ka...@cloudera.com>wrote:
> >>>
> >>>> Hi Weihua -
> >>>>
> >>>> Unfortunately, the generic jdbc manager does not support staging.
> >>>> As a result, I've filed
> >>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
> >>>>
> >>>> Regards, Kathleen
> >>>>
> >>>>
> >>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com>
> wrote:
> >>>>
> >>>>> Hi Guys,
> >>>>>
> >>>>>   Good afternoon!
> >>>>>   I have a question. I was trying to sqoop exporting from hdfs to
> >>>>> postgresql, using --staging-table options due to transactions
> >>>>> consideration. But it gives me error below.
> >>>>>   I am wondering if the staging_able is supported for
> >>>>> GenericJdbcManager? if not, what kind of manager should I use?
> >>>>>   Thanks very much!
> >>>>>
> >>>>>  -Weihua
> >>>>>
> >>>>> error message:
> >>>>>
> >>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
> >>>>> active connection manager
> (org.apache.sqoop.manager.GenericJdbcManager)
> >>>>> does not support staging of data for export. Please retry without
> >>>>> specifying the --staging-table option.
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by abhijeet gaikwad <ab...@gmail.com>.
Hi Srinivas,
I haven't played with the Teradata Connector, but in general there are two
properties that drive insertions (Sqoop export) into a table - namely:

1. "sqoop.export.records.per.statement": This property specifies the number
of records/rows to be inserted using a single INSERT statement. The default
value is 100.
2. "sqoop.export.statements.per.transaction": This property specifies the
number of INSERT statements issued before a commit is fired - which you can
think of as the batch size. The default value is 100.

You can use the -D Hadoop argument to specify these properties on the
command line, e.g. -Dsqoop.export.statements.per.transaction=50.

NOTE: Make sure you place this argument (-D) before any of the Sqoop
tool-specific arguments. See the Sqoop User Guide for more details.
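
For illustration, an export invocation that sets both properties might look
like the sketch below (the connection string, table name, and export
directory are placeholders, not values from this thread); note that the -D
options appear right after the tool name, before the tool-specific arguments:

sqoop export \
  -Dsqoop.export.records.per.statement=500 \
  -Dsqoop.export.statements.per.transaction=10 \
  --connect jdbc:mysql://dbhost/exampledb --username exampleuser \
  --table example_table --export-dir /tmp/example_export \
  --fields-terminated-by '|'

With these values, each INSERT statement carries 500 rows and a commit is
issued after every 10 statements, i.e. every 5000 rows.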

Thanks,
Abhijeet Gaikwad

On 1/26/12, Srinivas Surasani <va...@gmail.com> wrote:
> Kathleen,
>
> Any information on below request.
>
> Hi All,
>
> I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.
>
> I'm trying to export csv files from HDFS to Teradata, it works well with
> setting mapper to "1" ( with batch loading of 1000 records at a time ).
> when I tried increasing the number of mappers to more than one I'm getting
> the following error. Also, is it possible to configure batch size at the
> time of export ( from the command line)??
>
>
>  sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
> --connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx --password
> xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
> --fields-terminated-by '|' -m 4 --batch
>
> 12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
> 12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
> attempt_201112211106_68553_m_000001_2, Status : FAILED
> *java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
> [TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
> due to DeadLock*.
>
> Srinivas --
>
> On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <ka...@cloudera.com>wrote:
>
>> Srinivas, as it happens, the Cloudera Connector for Teradata supports
>> staging tables. It is freely available here:
>> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
>> .
>>
>> Regards, Kathleen
>>
>> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani
>> <va...@gmail.com>wrote:
>>
>>> Hi Kathleen,
>>>
>>> Same issue with Teradata.
>>>
>>>
>>> Srinivas --
>>>
>>>
>>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting
>>> <ka...@cloudera.com>wrote:
>>>
>>>> Hi Weihua -
>>>>
>>>> Unfortunately, the generic jdbc manager does not support staging.
>>>> As a result, I've filed
>>>> https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>>>
>>>> Regards, Kathleen
>>>>
>>>>
>>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>>   Good afternoon!
>>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>>>> postgresql, using --staging-table options due to transactions
>>>>> consideration. But it gives me error below.
>>>>>   I am wondering if the staging_able is supported for
>>>>> GenericJdbcManager? if not, what kind of manager should I use?
>>>>>   Thanks very much!
>>>>>
>>>>>  -Weihua
>>>>>
>>>>> error message:
>>>>>
>>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager)
>>>>> does not support staging of data for export. Please retry without
>>>>> specifying the --staging-table option.
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Srinivas Surasani <va...@gmail.com>.
Kathleen,

Any information on the request below?

Hi All,

I'm working on Hadoop CDH3 U0 and  Sqoop CDH3 U2.

I'm trying to export CSV files from HDFS to Teradata. It works well with
the number of mappers set to "1" (with batch loading of 1000 records at a
time). When I try increasing the number of mappers to more than one, I get
the following error. Also, is it possible to configure the batch size at
the time of export (from the command line)?


 sqoop export  --verbose --driver com.teradata.jdbc.TeraDriver
--connect jdbc:teradata://xxxx/database=xxxx  --username xxxxx --password
xxxxx --table xxxx --export-dir /user/surasani/10minutes.txt
--fields-terminated-by '|' -m 4 --batch

12/01/24 16:17:21 INFO mapred.JobClient:  map 3% reduce 0%
12/01/24 16:17:48 INFO mapred.JobClient: Task Id :
attempt_201112211106_68553_m_000001_2, Status : FAILED
*java.io.IOException: java.sql.BatchUpdateException: [Teradata Database]
[TeraJDBC 13.00.00.07] [Error 2631] [SQLState 40001] Transaction ABORTed
due to DeadLock*.

Srinivas --

On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <ka...@cloudera.com>wrote:

> Srinivas, as it happens, the Cloudera Connector for Teradata supports
> staging tables. It is freely available here:
> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
> .
>
> Regards, Kathleen
>
> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <va...@gmail.com>wrote:
>
>> Hi Kathleen,
>>
>> Same issue with Teradata.
>>
>>
>> Srinivas --
>>
>>
>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <ka...@cloudera.com>wrote:
>>
>>> Hi Weihua -
>>>
>>> Unfortunately, the generic jdbc manager does not support staging.
>>> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>>
>>> Regards, Kathleen
>>>
>>>
>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:
>>>
>>>> Hi Guys,
>>>>
>>>>   Good afternoon!
>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>>> postgresql, using --staging-table options due to transactions
>>>> consideration. But it gives me error below.
>>>>   I am wondering if the staging_able is supported for
>>>> GenericJdbcManager? if not, what kind of manager should I use?
>>>>   Thanks very much!
>>>>
>>>>  -Weihua
>>>>
>>>> error message:
>>>>
>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager)
>>>> does not support staging of data for export. Please retry without
>>>> specifying the --staging-table option.
>>>>
>>>>
>>>>
>>>
>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Srinivas Surasani <va...@gmail.com>.
Yes,  I looked into it.

 Thanks for the info.

Regards,
Srinivas --


On Wed, Jan 25, 2012 at 8:01 PM, Kathleen Ting <ka...@cloudera.com>wrote:

> Srinivas, as it happens, the Cloudera Connector for Teradata supports
> staging tables. It is freely available here:
> https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
> .
>
> Regards, Kathleen
>
> On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <va...@gmail.com>wrote:
>
>> Hi Kathleen,
>>
>> Same issue with Teradata.
>>
>>
>> Srinivas --
>>
>>
>> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <ka...@cloudera.com>wrote:
>>
>>> Hi Weihua -
>>>
>>> Unfortunately, the generic jdbc manager does not support staging.
>>> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>>
>>> Regards, Kathleen
>>>
>>>
>>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:
>>>
>>>> Hi Guys,
>>>>
>>>>   Good afternoon!
>>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>>> postgresql, using --staging-table options due to transactions
>>>> consideration. But it gives me error below.
>>>>   I am wondering if the staging_able is supported for
>>>> GenericJdbcManager? if not, what kind of manager should I use?
>>>>   Thanks very much!
>>>>
>>>>  -Weihua
>>>>
>>>> error message:
>>>>
>>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The
>>>> active connection manager (org.apache.sqoop.manager.GenericJdbcManager)
>>>> does not support staging of data for export. Please retry without
>>>> specifying the --staging-table option.
>>>>
>>>>
>>>>
>>>
>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Kathleen Ting <ka...@cloudera.com>.
Srinivas, as it happens, the Cloudera Connector for Teradata supports
staging tables. It is freely available here:
https://ccp.cloudera.com/display/con/Cloudera+Connector+for+Teradata+Download
.

Regards, Kathleen
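
For reference, a third-party connector like this registers itself with
Sqoop 1.x through a ManagerFactory entry in a file under the
conf/managers.d/ directory - the same mechanism used by the
mssqoop-sqlserver entry visible in the debug log later in this thread. A
hypothetical registration file, say
/usr/lib/sqoop/conf/managers.d/example-connector, would contain a single
line naming the factory class and, optionally, the jar that provides it
(the class name and jar path below are placeholders, not the Teradata
connector's actual values):

com.example.connector.ExampleManagerFactory=/usr/lib/sqoop/lib/example-connector.jar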

On Wed, Jan 25, 2012 at 3:36 PM, Srinivas Surasani <va...@gmail.com> wrote:

> Hi Kathleen,
>
> Same issue with Teradata.
>
>
> Srinivas --
>
>
> On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <ka...@cloudera.com>wrote:
>
>> Hi Weihua -
>>
>> Unfortunately, the generic jdbc manager does not support staging.
>> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>>
>> Regards, Kathleen
>>
>>
>> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:
>>
>>> Hi Guys,
>>>
>>>   Good afternoon!
>>>   I have a question. I was trying to sqoop exporting from hdfs to
>>> postgresql, using --staging-table options due to transactions
>>> consideration. But it gives me error below.
>>>   I am wondering if the staging_able is supported for
>>> GenericJdbcManager? if not, what kind of manager should I use?
>>>   Thanks very much!
>>>
>>>  -Weihua
>>>
>>> error message:
>>>
>>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active
>>> connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not
>>> support staging of data for export. Please retry without specifying the
>>> --staging-table option.
>>>
>>>
>>>
>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Srinivas Surasani <va...@gmail.com>.
Hi Kathleen,

Same issue with Teradata.


Srinivas --

On Mon, Jan 23, 2012 at 8:26 PM, Kathleen Ting <ka...@cloudera.com>wrote:

> Hi Weihua -
>
> Unfortunately, the generic jdbc manager does not support staging.
> As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431on your behalf.
>
> Regards, Kathleen
>
>
> On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:
>
>> Hi Guys,
>>
>>   Good afternoon!
>>   I have a question. I was trying to sqoop exporting from hdfs to
>> postgresql, using --staging-table options due to transactions
>> consideration. But it gives me error below.
>>   I am wondering if the staging_able is supported for GenericJdbcManager?
>> if not, what kind of manager should I use?
>>   Thanks very much!
>>
>>  -Weihua
>>
>> error message:
>>
>> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active
>> connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not
>> support staging of data for export. Please retry without specifying the
>> --staging-table option.
>>
>>
>>
>

Re: sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Kathleen Ting <ka...@cloudera.com>.
Hi Weihua -

Unfortunately, the generic jdbc manager does not support staging.
As a result, I've filed https://issues.apache.org/jira/browse/SQOOP-431 on
your behalf.

Regards, Kathleen


On Mon, Jan 23, 2012 at 3:10 PM, Weihua Zhu <wz...@adconion.com> wrote:

> Hi Guys,
>
>   Good afternoon!
>   I have a question. I was trying to sqoop exporting from hdfs to
> postgresql, using --staging-table options due to transactions
> consideration. But it gives me error below.
>   I am wondering if the staging_able is supported for GenericJdbcManager?
> if not, what kind of manager should I use?
>   Thanks very much!
>
>  -Weihua
>
> error message:
>
> 12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active
> connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not
> support staging of data for export. Please retry without specifying the
> --staging-table option.
>
>
>

sqoop exporting from hdfs to postgresql, using --staging-table option

Posted by Weihua Zhu <wz...@adconion.com>.
Hi Guys,

   Good afternoon!
   I have a question. I was trying to use Sqoop to export from HDFS to PostgreSQL, using the --staging-table option for transactional reasons, but it gives me the error below.
   I am wondering if the staging table is supported by GenericJdbcManager? If not, what kind of manager should I use?
   Thanks very much!

  -Weihua

error message: 

12/01/23 15:00:39 ERROR tool.ExportTool: Error during export: The active connection manager (org.apache.sqoop.manager.GenericJdbcManager) does not support staging of data for export. Please retry without specifying the --staging-table option.
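
For context, the command was not included in the message, but it was
presumably of roughly the following shape (a sketch with placeholder
connection details, table names, and paths). Specifying --driver without
--connection-manager is typically what causes Sqoop to fall back to the
generic org.apache.sqoop.manager.GenericJdbcManager named in the error, and
that manager rejects --staging-table:

sqoop export \
  --connect jdbc:postgresql://dbhost/exampledb \
  --driver org.postgresql.Driver \
  --username exampleuser \
  --table example_table \
  --staging-table example_table_stage \
  --export-dir /tmp/example_export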



RE: The --hive-overwrite doesn't overwrite data

Posted by David Langer <da...@hotmail.com>.
Duh, that took care of it!
 
Thanx for the help!
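
For reference, the fix suggested in the exchange quoted below was to add
--hive-import; the working invocation would then look roughly like this (a
sketch assembled from the commands quoted later in this message, not the
exact command that was run):

sqoop import --connect 'jdbc:mysql://localhost/AdventureWorks?zeroDateTimeBehavior=round' --username cloudera --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS' --split-by BusinessEntityID --target-dir /tmp/SalesPerson --hive-import --hive-overwrite --hive-table NDW_AdventureWorks_SalesPerson

A quick check that the overwrite happened is to confirm only the new JobID
value (87) remains, e.g. hive -e 'SELECT DISTINCT JobID FROM NDW_AdventureWorks_SalesPerson;'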

 

> From: kathleen@cloudera.com
> Date: Fri, 20 Jan 2012 15:58:47 -0800
> Subject: Re: The --hive-overwrite doesn't overwrite data
> To: sqoop-user@incubator.apache.org
> 
> Dave - can you try adding the --hive-import option?
> 
> Regards, Kathleen
> 
> On Fri, Jan 20, 2012 at 3:07 PM, David Langer <da...@hotmail.com> wrote:
> > Sure. Here it is:
> >
> > [cloudera@localhost ~]$ hive;
> > Hive history
> > file=/tmp/cloudera/hive_job_log_cloudera_201201201806_30238324.txt
> > hive> show tables;
> > OK
> > ndw_adventureworks_salesperson
> > Time taken: 3.716 seconds
> > hive> quit;
> > [cloudera@localhost ~]$ sqoop import --connect
> > 'jdbc:mysql://localhost/AdventureWorks?zeroDateTimeBehavior=round'
> > --username cloudera --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE
> > $CONDITIONS' --split-by BusinessEntityID  --target-dir /tmp/SalesPerson
> > --hive-overwrite --hive-table NDW_AdventureWorks_SalesPerson --verbose
> > 12/01/20 18:02:34 DEBUG tool.BaseSqoopTool: Enabled debug logging.
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Added factory
> > com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory specified by
> > /usr/lib/sqoop/conf/managers.d/mssqoop-sqlserver
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory:
> > com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory:
> > com.cloudera.sqoop.manager.DefaultManagerFactory
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> > com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> > com.cloudera.sqoop.manager.DefaultManagerFactory
> > 12/01/20 18:02:34 DEBUG manager.DefaultManagerFactory: Trying with scheme:
> > jdbc:mysql:
> > 12/01/20 18:02:34 INFO manager.MySQLManager: Preparing to use a MySQL
> > streaming resultset.
> > 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Instantiated ConnManager
> > com.cloudera.sqoop.manager.MySQLManager@303020ad
> > 12/01/20 18:02:34 INFO tool.CodeGenTool: Beginning code generation
> > 12/01/20 18:02:35 DEBUG manager.SqlManager: No connection paramenters
> > specified. Using regular API for making connection.
> > 12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query:
> > -2147483648
> > 12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT
> > *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0)
> > 12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query:
> > -2147483648
> > 12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT
> > *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0)
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter: selected columns:
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   BusinessEntityID
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   TerritoryID
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesQuota
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   Bonus
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   CommissionPct
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesYTD
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesLastYear
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   rowguid
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   ModifiedDate
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter:   JobID
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter: Writing source file:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter: Table name: null
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter: Columns: BusinessEntityID:4,
> > TerritoryID:4, SalesQuota:3, Bonus:3, CommissionPct:3, SalesYTD:3,
> > SalesLastYear:3, rowguid:12, ModifiedDate:93, JobID:-5,
> > 12/01/20 18:02:35 DEBUG orm.ClassWriter: sourceFilename is QueryResult.java
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager: Found existing
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> > 12/01/20 18:02:35 INFO orm.CompilationManager: HADOOP_HOME is
> > /usr/lib/hadoop
> > 12/01/20 18:02:35 INFO orm.CompilationManager: Found hadoop core jar at:
> > /usr/lib/hadoop/hadoop-core.jar
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager: Adding source file:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager: Invoking javac with args:
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -sourcepath
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -d
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -classpath
> > 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> > /usr/lib/hadoop/conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/sqoop/conf:/etc/zookeeper::/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/usr/lib/sqoop/lib/avro-1.5.4.jar:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar:/usr/lib/sqoop/lib/commons-io-1.4.jar:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/sqoop/lib/jopt-simple-3.2.jar:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/sqoop/lib/paranamer-2.3.jar:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar:/usr/lib/sqoop/lib/sqljdbc4.jar:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar:/usr/lib/hadoop/conf:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2
.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hbase/bin/..:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2.jar:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2-tests.jar:/usr/lib/hbase/bin/../lib/activation-1.1.jar:/usr/lib/hbase/bin/../lib/asm-3.1.jar:/usr/lib/hbase/bin/../lib/avro-1.5.4.jar:/usr/lib/hbase/bin/../lib/avro-ipc-1.5.4.jar:/usr/lib/hbase/bin/../lib/commons-cli-1.2.jar:/usr/lib/hbase/bin/../lib/commons-codec-1.4.jar:/usr/lib/hbase/bin/../lib/commons-el-1.0.jar:/usr/lib/hbase/bin/../lib/commons-httpclient-3.1.jar:/usr/lib/hbase/bin/../lib/commons-lang-2.5.jar:/usr/lib/hbase/bin/../lib/commons-logging-1.1.1.jar:/usr/lib/hbase/bin/../lib/commons-net-1.4.1.jar:/usr/lib/hbase/bin/../lib/core-3.1.1.jar:/usr/lib/hbase/bin/../lib/guava-r06.jar:/usr/lib/hbase/bin/../lib/hadoop-core.jar:/usr/lib/hbase/bin/../lib/jackson-core-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-jaxrs-1.5.5.jar:/usr/lib/hbase/bin/../lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-xc-1.5.5.jar:/usr/lib/hbase/bin/../lib/jamon-runtime-2.3.1.jar:/usr/lib/hbase/bin/../lib/jasper-compiler-5.5.23.jar:/usr/lib/hbase/bin/../lib/jasper-runtime-5.5.23.jar:/usr/lib/hbase/bin/../lib/jaxb-api-2.1.jar:/usr/lib/hbase/bin/../lib/jaxb-impl-2.1.12.jar:/usr/lib/hbase/bin/../lib/jersey-core-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-json-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-server-1.4.jar:/usr/lib/hbase/bin/../lib/jettison-1.1.jar:/usr/lib/hbase/bin/../lib/jetty-6.1.26.jar:/usr/lib/hbase/bin/../lib/jetty-util-6.1.26.jar:/usr/lib/hbase/bin/../lib/jruby-complete-1.6.0.jar:/usr/lib/hbase/bin/../lib/jsp-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1.jar:/usr/lib/hbase/bin/../lib/jsr311-api-1.1.1.jar:/usr/lib/hbase/bin/../lib/log4j-1.2.16.jar:/usr/lib/hbase/bin/../lib/netty-3.2.4.Final.jar:/usr/lib/hbase/bin/../lib/protobuf-java-2.3.0.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5.jar:/usr/lib/hbase/bin/../lib/slf4j-api-1.5.8.jar:/usr/lib/hbase/bin/../lib/slf4j-log4j12-1.5.8.jar:/usr/lib/hbase/bin/../lib/snappy-java-1.0.3.2.jar:/usr/lib/hbase/bin/../lib/stax-api-1.0.1.jar:/usr/lib/hbase/bin/../lib/thrift-0.2.0.jar:/usr/lib/hbase/bin/../lib/velocity-1.5.jar:/usr/lib/hbase/bin/../lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../lib/zookeeper.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar:/usr/lib/sqoop/sqoop-test-1.3.0-cdh3u2.jar::/usr/lib/hadoop/hadoop-core.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> > 12/01/20 18:02:36 ERROR orm.CompilationManager: Could not rename
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> > to /home/cloudera/./QueryResult.java
> > java.io.IOException: Destination '/home/cloudera/./QueryResult.java' already
> > exists
> >         at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
> >         at
> > com.cloudera.sqoop.orm.CompilationManager.compile(CompilationManager.java:229)
> >         at
> > com.cloudera.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:85)
> >         at
> > com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:369)
> >         at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
> >         at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
> >         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
> >         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
> >         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> > 12/01/20 18:02:36 INFO orm.CompilationManager: Writing jar file:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
> > 12/01/20 18:02:36 DEBUG orm.CompilationManager: Scanning for .class files in
> > directory: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99
> > 12/01/20 18:02:36 DEBUG orm.CompilationManager: Got classfile:
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.class
> > -> QueryResult.class
> > 12/01/20 18:02:36 DEBUG orm.CompilationManager: Finished writing jar file
> > /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
> > 12/01/20 18:02:36 INFO mapreduce.ImportJobBase: Beginning query import.
> > 12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using table class:
> > QueryResult
> > 12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat:
> > class com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/paranamer-2.3.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/sqljdbc4.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/avro-1.5.4.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/commons-io-1.4.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar
> > 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> > file:/usr/lib/sqoop/lib/jopt-simple-3.2.jar
> > 12/01/20 18:02:38 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT
> > MIN(BusinessEntityID), MAX(BusinessEntityID) FROM (SELECT *, 87 AS JobID
> > FROM SalesPerson WHERE  (1 = 1) ) AS t1
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter: Splits:
> > [                         274 to                          290] into 4 parts
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          274
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          278
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          282
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          286
> > 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          290
> > 12/01/20 18:02:39 INFO mapred.JobClient: Running job: job_201201201632_0008
> > 12/01/20 18:02:40 INFO mapred.JobClient:  map 0% reduce 0%
> > 12/01/20 18:02:54 INFO mapred.JobClient:  map 50% reduce 0%
> > 12/01/20 18:02:59 INFO mapred.JobClient:  map 75% reduce 0%
> > 12/01/20 18:03:00 INFO mapred.JobClient:  map 100% reduce 0%
> > 12/01/20 18:03:02 INFO mapred.JobClient: Job complete: job_201201201632_0008
> > 12/01/20 18:03:02 INFO mapred.JobClient: Counters: 12
> > 12/01/20 18:03:02 INFO mapred.JobClient:   Job Counters
> > 12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=28816
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all reduces
> > waiting after reserving slots (ms)=0
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all maps
> > waiting after reserving slots (ms)=0
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Launched map tasks=4
> > 12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> > 12/01/20 18:03:02 INFO mapred.JobClient:   FileSystemCounters
> > 12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_READ=505
> > 12/01/20 18:03:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=270332
> > 12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1867
> > 12/01/20 18:03:02 INFO mapred.JobClient:   Map-Reduce Framework
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Map input records=17
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Spilled Records=0
> > 12/01/20 18:03:02 INFO mapred.JobClient:     Map output records=17
> > 12/01/20 18:03:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=505
> > 12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Transferred 1.8232 KB in
> > 25.4856 seconds (73.2572 bytes/sec)
> > 12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Retrieved 17 records.
> > [cloudera@localhost ~]$
> >
> >
> >> From: kathleen@cloudera.com
> >> Date: Fri, 20 Jan 2012 14:41:32 -0800
> >
> >> Subject: Re: The --hive-overwrite doesn't overwrite data
> >> To: sqoop-user@incubator.apache.org
> >>
> >> Dave - to aid in debugging, please re-run your Sqoop job with the
> >> --verbose flag and then paste the console log.
> >>
> >> Thanks, Kathleen
> >>
> >> > On Fri, Jan 20, 2012 at 11:51 AM, David Langer
> >> > <da...@hotmail.com> wrote:
> >> >> Greetings!
> >> >>
> >> >> Hopefully this isn't too much of a newbie question, but I am unable to
> >> >> get
> >> >> the --hive-overwrite argument working. I'm using sqoop 1.3.0-cdh3u2 on
> >> >> the
> >> >> Cloudera VMWare Player VM.
> >> >>
> >> >>
> >> >> The following sqoop invocation succeeds in creating the Hive table and
> >> >> populates it with data:
> >> >>
> >> >> sqoop import --connect
> >> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username
> >> >> cloudera
> >> >> --query 'SELECT *, 47 AS JobID FROM SalesPerson WHERE $CONDITIONS'
> >> >> --split-by ID  --target-dir /tmp/SalesPerson --create-hive-table
> >> >> --hive-import --hive-table MyDB_SalesPerson
> >> >>
> >> >>
> >> >> However, while the following sqoop invocation does produce the desired
> >> >> data
> >> >> in HDFS (i.e., /tmp/SalesPerson) it does not overwrite the data in the
> >> >> Hive
> >> >> table:
> >> >>
> >> >> sqoop import --connect
> >> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username
> >> >> cloudera
> >> >> --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS'
> >> >> --split-by ID  --target-dir /tmp/SalesPerson --hive-overwrite
> >> >> --hive-table
> >> >> MyDB_salesperson
> >> >>
> >> >>
> >> >> There is nothing in Hive.log that indicates the --hive-overwrite sqoop
> >> >> invocation is interacting with Hive (e.g., no exceptions).
> >> >>
> >> >> Any assistance would be greatly appreciated.
> >> >>
> >> >> Thanx,
> >> >>
> >> >> Dave
 		 	   		  

Re: The --hive-overwrite doesn't overwrite data

Posted by Kathleen Ting <ka...@cloudera.com>.
Dave - can you try adding the --hive-import option?

Regards, Kathleen

On Fri, Jan 20, 2012 at 3:07 PM, David Langer <da...@hotmail.com> wrote:
> Sure. Here it is:
>
> [cloudera@localhost ~]$ hive;
> Hive history
> file=/tmp/cloudera/hive_job_log_cloudera_201201201806_30238324.txt
> hive> show tables;
> OK
> ndw_adventureworks_salesperson
> Time taken: 3.716 seconds
> hive> quit;
> [cloudera@localhost ~]$ sqoop import --connect
> 'jdbc:mysql://localhost/AdventureWorks?zeroDateTimeBehavior=round'
> --username cloudera --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE
> $CONDITIONS' --split-by BusinessEntityID  --target-dir /tmp/SalesPerson
> --hive-overwrite --hive-table NDW_AdventureWorks_SalesPerson --verbose
> 12/01/20 18:02:34 DEBUG tool.BaseSqoopTool: Enabled debug logging.
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Added factory
> com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory specified by
> /usr/lib/sqoop/conf/managers.d/mssqoop-sqlserver
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory:
> com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
> 12/01/20 18:02:34 DEBUG manager.DefaultManagerFactory: Trying with scheme:
> jdbc:mysql:
> 12/01/20 18:02:34 INFO manager.MySQLManager: Preparing to use a MySQL
> streaming resultset.
> 12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Instantiated ConnManager
> com.cloudera.sqoop.manager.MySQLManager@303020ad
> 12/01/20 18:02:34 INFO tool.CodeGenTool: Beginning code generation
> 12/01/20 18:02:35 DEBUG manager.SqlManager: No connection paramenters
> specified. Using regular API for making connection.
> 12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query:
> -2147483648
> 12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT
> *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0)
> 12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query:
> -2147483648
> 12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT
> *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0)
> 12/01/20 18:02:35 DEBUG orm.ClassWriter: selected columns:
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   BusinessEntityID
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   TerritoryID
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesQuota
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   Bonus
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   CommissionPct
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesYTD
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesLastYear
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   rowguid
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   ModifiedDate
> 12/01/20 18:02:35 DEBUG orm.ClassWriter:   JobID
> 12/01/20 18:02:35 DEBUG orm.ClassWriter: Writing source file:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> 12/01/20 18:02:35 DEBUG orm.ClassWriter: Table name: null
> 12/01/20 18:02:35 DEBUG orm.ClassWriter: Columns: BusinessEntityID:4,
> TerritoryID:4, SalesQuota:3, Bonus:3, CommissionPct:3, SalesYTD:3,
> SalesLastYear:3, rowguid:12, ModifiedDate:93, JobID:-5,
> 12/01/20 18:02:35 DEBUG orm.ClassWriter: sourceFilename is QueryResult.java
> 12/01/20 18:02:35 DEBUG orm.CompilationManager: Found existing
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> 12/01/20 18:02:35 INFO orm.CompilationManager: HADOOP_HOME is
> /usr/lib/hadoop
> 12/01/20 18:02:35 INFO orm.CompilationManager: Found hadoop core jar at:
> /usr/lib/hadoop/hadoop-core.jar
> 12/01/20 18:02:35 DEBUG orm.CompilationManager: Adding source file:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> 12/01/20 18:02:35 DEBUG orm.CompilationManager: Invoking javac with args:
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -sourcepath
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -d
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:   -classpath
> 12/01/20 18:02:35 DEBUG orm.CompilationManager:
> /usr/lib/hadoop/conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/sqoop/conf:/etc/zookeeper::/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/usr/lib/sqoop/lib/avro-1.5.4.jar:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar:/usr/lib/sqoop/lib/commons-io-1.4.jar:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/sqoop/lib/jopt-simple-3.2.jar:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/sqoop/lib/paranamer-2.3.jar:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar:/usr/lib/sqoop/lib/sqljdbc4.jar:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar:/usr/lib/hadoop/conf:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2
.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hbase/bin/..:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2.jar:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2-tests.jar:/usr/lib/hbase/bin/../lib/activation-1.1.jar:/usr/lib/hbase/bin/../lib/asm-3.1.jar:/usr/lib/hbase/bin/../lib/avro-1.5.4.jar:/usr/lib/hbase/bin/../lib/avro-ipc-1.5.4.jar:/usr/lib/hbase/bin/../lib/commons-cli-1.2.jar:/usr/lib/hbase/bin/../lib/commons-codec-1.4.jar:/usr/lib/hbase/bin/../lib/commons-el-1.0.jar:/usr/lib/hbase/bin/../lib/commons-httpclient-3.1.jar:/usr/lib/hbase/bin/../lib/commons-lang-2.5.jar:/usr/lib/hbase/bin/../lib/commons-logging-1.1.1.jar:/usr/lib/hbase/bin/../lib/commons-net-1.4.1.jar:/usr/lib/hbase/bin/../lib/core-3.1.1.jar:/usr/lib/hbase/bin/../lib/guava-r06.jar:/usr/lib/hbase/bin/../lib/hadoop-core.jar:/usr/lib/hbase/bin/../lib/jackson-core-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-jaxrs-1.5.5.jar:/usr/lib/hbase/bin/../lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-xc-1.5.5.jar:/usr/lib/hbase/bin/../lib/jamon-runtime-2.3.1.jar:/usr/lib/hbase/bin/../lib/jasper-compiler-5.5.23.jar:/usr/lib/hbase/bin/../lib/jasper-runtime-5.5.23.jar:/usr/lib/hbase/bin/../lib/jaxb-api-2.1.jar:/usr/lib/hbase/bin/../lib/jaxb-impl-2.1.12.jar:/usr/lib/hbase/bin/../lib/jersey-core-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-json-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-server-1.4.jar:/usr/lib/hbase/bin/../lib/jettison-1.1.jar:/usr/lib/hbase/bin/../lib/jetty-6.1.26.jar:/usr/lib/hbase/bin/../lib/jetty-util-6.1.26.jar:/usr/lib/hbase/bin/../lib/jruby-complete-1.6.0.jar:/usr/lib/hbase/bin/../lib/jsp-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1.jar:/usr/lib/hbase/bin/../lib/jsr311-api-1.1.1.jar:/usr/lib/hbase/bin/../lib/log4j-1.2.16.jar:/usr/lib/hbase/bin/../lib/netty-3.2.4.Final.jar:/usr/lib/hbase/bin/../lib/protobuf-java-2.3.0.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5.jar:/usr/lib/hbase/bin/../lib/slf4j-api-1.5.8.jar:/usr/lib/hbase/bin/../lib/slf4j-log4j12-1.5.8.jar:/usr/lib/hbase/bin/../lib/snappy-java-1.0.3.2.jar:/usr/lib/hbase/bin/../lib/stax-api-1.0.1.jar:/usr/lib/hbase/bin/../lib/thrift-0.2.0.jar:/usr/lib/hbase/bin/../lib/velocity-1.5.jar:/usr/lib/hbase/bin/../lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../lib/zookeeper.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar:/usr/lib/sqoop/sqoop-test-1.3.0-cdh3u2.jar::/usr/lib/hadoop/hadoop-core.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> 12/01/20 18:02:36 ERROR orm.CompilationManager: Could not rename
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
> to /home/cloudera/./QueryResult.java
> java.io.IOException: Destination '/home/cloudera/./QueryResult.java' already
> exists
>         at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
>         at
> com.cloudera.sqoop.orm.CompilationManager.compile(CompilationManager.java:229)
>         at
> com.cloudera.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:85)
>         at
> com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:369)
>         at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
>         at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
>         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
>         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> 12/01/20 18:02:36 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
> 12/01/20 18:02:36 DEBUG orm.CompilationManager: Scanning for .class files in
> directory: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99
> 12/01/20 18:02:36 DEBUG orm.CompilationManager: Got classfile:
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.class
> -> QueryResult.class
> 12/01/20 18:02:36 DEBUG orm.CompilationManager: Finished writing jar file
> /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
> 12/01/20 18:02:36 INFO mapreduce.ImportJobBase: Beginning query import.
> 12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using table class:
> QueryResult
> 12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat:
> class com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/paranamer-2.3.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/sqljdbc4.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/avro-1.5.4.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/commons-io-1.4.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar
> 12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/usr/lib/sqoop/lib/jopt-simple-3.2.jar
> 12/01/20 18:02:38 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT
> MIN(BusinessEntityID), MAX(BusinessEntityID) FROM (SELECT *, 87 AS JobID
> FROM SalesPerson WHERE  (1 = 1) ) AS t1
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter: Splits:
> [                         274 to                          290] into 4 parts
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          274
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          278
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          282
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          286
> 12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          290
> 12/01/20 18:02:39 INFO mapred.JobClient: Running job: job_201201201632_0008
> 12/01/20 18:02:40 INFO mapred.JobClient:  map 0% reduce 0%
> 12/01/20 18:02:54 INFO mapred.JobClient:  map 50% reduce 0%
> 12/01/20 18:02:59 INFO mapred.JobClient:  map 75% reduce 0%
> 12/01/20 18:03:00 INFO mapred.JobClient:  map 100% reduce 0%
> 12/01/20 18:03:02 INFO mapred.JobClient: Job complete: job_201201201632_0008
> 12/01/20 18:03:02 INFO mapred.JobClient: Counters: 12
> 12/01/20 18:03:02 INFO mapred.JobClient:   Job Counters
> 12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=28816
> 12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all reduces
> waiting after reserving slots (ms)=0
> 12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 12/01/20 18:03:02 INFO mapred.JobClient:     Launched map tasks=4
> 12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 12/01/20 18:03:02 INFO mapred.JobClient:   FileSystemCounters
> 12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_READ=505
> 12/01/20 18:03:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=270332
> 12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1867
> 12/01/20 18:03:02 INFO mapred.JobClient:   Map-Reduce Framework
> 12/01/20 18:03:02 INFO mapred.JobClient:     Map input records=17
> 12/01/20 18:03:02 INFO mapred.JobClient:     Spilled Records=0
> 12/01/20 18:03:02 INFO mapred.JobClient:     Map output records=17
> 12/01/20 18:03:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=505
> 12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Transferred 1.8232 KB in
> 25.4856 seconds (73.2572 bytes/sec)
> 12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Retrieved 17 records.
> [cloudera@localhost ~]$
>
>
>> From: kathleen@cloudera.com
>> Date: Fri, 20 Jan 2012 14:41:32 -0800
>
>> Subject: Re: The --hive-overwrite doesn't overwrite data
>> To: sqoop-user@incubator.apache.org
>>
>> Dave - to aid in debugging, please re-run your Sqoop job with the
>> --verbose flag and then paste the console log.
>>
>> Thanks, Kathleen
>>
>> > On Fri, Jan 20, 2012 at 11:51 AM, David Langer
>> > <da...@hotmail.com> wrote:
>> >> Greetings!
>> >>
>> >> Hopefully this isn't too much of a newbie question, but I am unable to
>> >> get
>> >> the --hive-overwrite argument working. I'm using sqoop 1.3.0-cdh3u2 on
>> >> the
>> >> Cloudera VMWare Player VM.
>> >>
>> >>
>> >> The following sqoop invocation succeeds in creating the Hive table and
>> >> populates it with data:
>> >>
>> >> sqoop import --connect
>> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username
>> >> cloudera
>> >> --query 'SELECT *, 47 AS JobID FROM SalesPerson WHERE $CONDITIONS'
>> >> --split-by ID  --target-dir /tmp/SalesPerson --create-hive-table
>> >> --hive-import --hive-table MyDB_SalesPerson
>> >>
>> >>
>> >> However, while the following sqoop invocation does produce the desired
>> >> data
>> >> in HDFS (i.e., /tmp/SalesPerson) it does not overwrite the data in the
>> >> Hive
>> >> table:
>> >>
>> >> sqoop import --connect
>> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username
>> >> cloudera
>> >> --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS'
>> >> --split-by ID  --target-dir /tmp/SalesPerson --hive-overwrite
>> >> --hive-table
>> >> MyDB_salesperson
>> >>
>> >>
>> >> There is nothing in Hive.log that indicates the --hive-overwrite sqoop
>> >> invocation is interacting with Hive (e.g., no exceptions).
>> >>
>> >> Any assistance would be greatly appreciated.
>> >>
>> >> Thanx,
>> >>
>> >> Dave

RE: The --hive-overwrite doesn't overwrite data

Posted by David Langer <da...@hotmail.com>.
Sure. Here it is:
 
[cloudera@localhost ~]$ hive;
Hive history file=/tmp/cloudera/hive_job_log_cloudera_201201201806_30238324.txt
hive> show tables;
OK
ndw_adventureworks_salesperson
Time taken: 3.716 seconds
hive> quit;
[cloudera@localhost ~]$ sqoop import --connect 'jdbc:mysql://localhost/AdventureWorks?zeroDateTimeBehavior=round' --username cloudera --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS' --split-by BusinessEntityID  --target-dir /tmp/SalesPerson --hive-overwrite --hive-table NDW_AdventureWorks_SalesPerson --verbose
12/01/20 18:02:34 DEBUG tool.BaseSqoopTool: Enabled debug logging.
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Added factory com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory specified by /usr/lib/sqoop/conf/managers.d/mssqoop-sqlserver
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory: com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.microsoft.sqoop.SqlServer.MSSQLServerManagerFactory
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
12/01/20 18:02:34 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
12/01/20 18:02:34 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
12/01/20 18:02:34 DEBUG sqoop.ConnFactory: Instantiated ConnManager com.cloudera.sqoop.manager.MySQLManager@303020ad
12/01/20 18:02:34 INFO tool.CodeGenTool: Beginning code generation
12/01/20 18:02:35 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648
12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0) 
12/01/20 18:02:35 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648
12/01/20 18:02:35 INFO manager.SqlManager: Executing SQL statement: SELECT *, 87 AS JobID FROM SalesPerson WHERE  (1 = 0) 
12/01/20 18:02:35 DEBUG orm.ClassWriter: selected columns:
12/01/20 18:02:35 DEBUG orm.ClassWriter:   BusinessEntityID
12/01/20 18:02:35 DEBUG orm.ClassWriter:   TerritoryID
12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesQuota
12/01/20 18:02:35 DEBUG orm.ClassWriter:   Bonus
12/01/20 18:02:35 DEBUG orm.ClassWriter:   CommissionPct
12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesYTD
12/01/20 18:02:35 DEBUG orm.ClassWriter:   SalesLastYear
12/01/20 18:02:35 DEBUG orm.ClassWriter:   rowguid
12/01/20 18:02:35 DEBUG orm.ClassWriter:   ModifiedDate
12/01/20 18:02:35 DEBUG orm.ClassWriter:   JobID
12/01/20 18:02:35 DEBUG orm.ClassWriter: Writing source file: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
12/01/20 18:02:35 DEBUG orm.ClassWriter: Table name: null
12/01/20 18:02:35 DEBUG orm.ClassWriter: Columns: BusinessEntityID:4, TerritoryID:4, SalesQuota:3, Bonus:3, CommissionPct:3, SalesYTD:3, SalesLastYear:3, rowguid:12, ModifiedDate:93, JobID:-5, 
12/01/20 18:02:35 DEBUG orm.ClassWriter: sourceFilename is QueryResult.java
12/01/20 18:02:35 DEBUG orm.CompilationManager: Found existing /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
12/01/20 18:02:35 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
12/01/20 18:02:35 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop/hadoop-core.jar
12/01/20 18:02:35 DEBUG orm.CompilationManager: Adding source file: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java
12/01/20 18:02:35 DEBUG orm.CompilationManager: Invoking javac with args:
12/01/20 18:02:35 DEBUG orm.CompilationManager:   -sourcepath
12/01/20 18:02:35 DEBUG orm.CompilationManager:   /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
12/01/20 18:02:35 DEBUG orm.CompilationManager:   -d
12/01/20 18:02:35 DEBUG orm.CompilationManager:   /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/
12/01/20 18:02:35 DEBUG orm.CompilationManager:   -classpath
12/01/20 18:02:35 DEBUG orm.CompilationManager:   /usr/lib/hadoop/conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/sqoop/conf:/etc/zookeeper::/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/usr/lib/sqoop/lib/avro-1.5.4.jar:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar:/usr/lib/sqoop/lib/commons-io-1.4.jar:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/sqoop/lib/jopt-simple-3.2.jar:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/sqoop/lib/paranamer-2.3.jar:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar:/usr/lib/sqoop/lib/sqljdbc4.jar:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar:/usr/lib/hadoop/conf:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0-cdh3u2.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop
/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../conf:/usr/java/jdk1.6.0_21/lib/tools.jar:/usr/lib/hbase/bin/..:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2.jar:/usr/lib/hbase/bin/../hbase-0.90.4-cdh3u2-tests.jar:/usr/lib/hbase/bin/../lib/activation-1.1.jar:/usr/lib/hbase/bin/../lib/asm-3.1.jar:/usr/lib/hbase/bin/../lib/avro-1.5.4.jar:/usr/lib/hbase/bin/../lib/avro-ipc-1.5.4.jar:/usr/lib/hbase/bin/../lib/commons-cli-1.2.jar:/usr/lib/hbase/bin/../lib/commons-codec-1.4.jar:/usr/lib/hbase/bin/../lib/commons-el-1.0.jar:/usr/lib/hbase/bin/../lib/commons-httpclient-3.1.jar:/usr/lib/hbase/bin/../lib/commons-lang-2.5.jar:/usr/lib/hbase/bin/../lib/commons-logging-1.1.1.jar:/usr/lib/hbase/bin/../lib/commons-net-1.4.1.jar:/usr/lib/hbase/bin/../lib/core-3.1.1.jar:/usr/lib/hbase/bin/../lib/guava-r06.jar:/usr/lib/hbase/bin/../lib/hadoop-core.jar:/usr/lib/hbase/bin/../lib/jackson-core-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-jaxrs-1.5.5.jar:/usr/lib/hbase/bin/../lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hbase/bin/../lib/jackson-xc-1.5.5.jar:/usr/lib/hbase/bin/../lib/jamon-runtime-2.3.1.jar:/usr/lib/hbase/bin/../lib/jasper-compiler-5.5.23.jar:/usr/lib/hbase/bin/../lib/jasper-runtime-5.5.23.jar:/usr/lib/hbase/bin/../lib/jaxb-api-2.1.jar:/usr/lib/hbase/bin/../lib/jaxb-impl-2.1.12.jar:/usr/lib/hbase/bin/../lib/jersey-core-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-json-1.4.jar:/usr/lib/hbase/bin/../lib/jersey-server-1.4.jar:/usr/lib/hbase/bin/../lib/jettison-1.1.jar:/usr/lib/hbase/bin/../lib/jetty-6.1.26.jar:/usr/lib/hbase/bin/../lib/jetty-util-6.1.26.jar:/usr/lib/hbase/bin/../lib/jruby-complete-1.6.0.jar:/usr/lib/hbase/bin/../lib/jsp-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1-6.1.14.jar:/usr/lib/hbase/bin/../lib/jsp-api-2.1.jar:/usr/lib/hbase/bin/../lib/jsr311-api-1.1.1.jar:/usr/lib/hbase/bin/../lib/log4j-1.2.16.jar:/usr/lib/hbase/bin/../lib/netty-3.2.4.Final.jar:/usr/lib/hbase/bin/../lib/protobuf-java-2.3.0.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hbase/bin/../lib/servlet-api-2.5.jar:/usr/lib/hbase/bin/../lib/slf4j-api-1.5.8.jar:/usr/lib/hbase/bin/../lib/slf4j-log4j12-1.5.8.jar:/usr/lib/hbase/bin/../lib/snappy-java-1.0.3.2.jar:/usr/lib/hbase/bin/../lib/stax-api-1.0.1.jar:/usr/lib/hbase/bin/../lib/thrift-0.2.0.jar:/usr/lib/hbase/bin/../lib/velocity-1.5.jar:/usr/lib/hbase/bin/../lib/xmlenc-0.52.jar:/usr/lib/hbase/bin/../lib/zookeeper.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar:/usr/lib/sqoop/sqoop-test-1.3.0-cdh3u2.jar::/usr/lib/hadoop/hadoop-core.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
12/01/20 18:02:36 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.java to /home/cloudera/./QueryResult.java
java.io.IOException: Destination '/home/cloudera/./QueryResult.java' already exists
        at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
        at com.cloudera.sqoop.orm.CompilationManager.compile(CompilationManager.java:229)
        at com.cloudera.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:85)
        at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:369)
        at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
        at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
        at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
        at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
        at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
12/01/20 18:02:36 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
12/01/20 18:02:36 DEBUG orm.CompilationManager: Scanning for .class files in directory: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99
12/01/20 18:02:36 DEBUG orm.CompilationManager: Got classfile: /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.class -> QueryResult.class
12/01/20 18:02:36 DEBUG orm.CompilationManager: Finished writing jar file /tmp/sqoop-cloudera/compile/d93e798470bd6dd21aa2d218ef8d4f99/QueryResult.jar
12/01/20 18:02:36 INFO mapreduce.ImportJobBase: Beginning query import.
12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using table class: QueryResult
12/01/20 18:02:37 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat: class com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/paranamer-2.3.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/sqljdbc4.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/avro-1.5.4.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/sqoop-sqlserver-1.0.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/commons-io-1.4.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/mysql-connector-java-5.0.8-bin.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar
12/01/20 18:02:37 DEBUG mapreduce.JobBase: Adding to job classpath: file:/usr/lib/sqoop/lib/jopt-simple-3.2.jar
12/01/20 18:02:38 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(BusinessEntityID), MAX(BusinessEntityID) FROM (SELECT *, 87 AS JobID FROM SalesPerson WHERE  (1 = 1) ) AS t1
12/01/20 18:02:38 DEBUG db.IntegerSplitter: Splits: [                         274 to                          290] into 4 parts
12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          274
12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          278
12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          282
12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          286
12/01/20 18:02:38 DEBUG db.IntegerSplitter:                          290
12/01/20 18:02:39 INFO mapred.JobClient: Running job: job_201201201632_0008
12/01/20 18:02:40 INFO mapred.JobClient:  map 0% reduce 0%
12/01/20 18:02:54 INFO mapred.JobClient:  map 50% reduce 0%
12/01/20 18:02:59 INFO mapred.JobClient:  map 75% reduce 0%
12/01/20 18:03:00 INFO mapred.JobClient:  map 100% reduce 0%
12/01/20 18:03:02 INFO mapred.JobClient: Job complete: job_201201201632_0008
12/01/20 18:03:02 INFO mapred.JobClient: Counters: 12
12/01/20 18:03:02 INFO mapred.JobClient:   Job Counters 
12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=28816
12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/01/20 18:03:02 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/01/20 18:03:02 INFO mapred.JobClient:     Launched map tasks=4
12/01/20 18:03:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/01/20 18:03:02 INFO mapred.JobClient:   FileSystemCounters
12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_READ=505
12/01/20 18:03:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=270332
12/01/20 18:03:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1867
12/01/20 18:03:02 INFO mapred.JobClient:   Map-Reduce Framework
12/01/20 18:03:02 INFO mapred.JobClient:     Map input records=17
12/01/20 18:03:02 INFO mapred.JobClient:     Spilled Records=0
12/01/20 18:03:02 INFO mapred.JobClient:     Map output records=17
12/01/20 18:03:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=505
12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Transferred 1.8232 KB in 25.4856 seconds (73.2572 bytes/sec)
12/01/20 18:03:02 INFO mapreduce.ImportJobBase: Retrieved 17 records.
[cloudera@localhost ~]$ 

 

> From: kathleen@cloudera.com
> Date: Fri, 20 Jan 2012 14:41:32 -0800
> Subject: Re: The --hive-overwrite doesn't overwrite data
> To: sqoop-user@incubator.apache.org
> 
> Dave - to aid in debugging, please re-run your Sqoop job with the
> --verbose flag and then paste the console log.
> 
> Thanks, Kathleen
> 
> > On Fri, Jan 20, 2012 at 11:51 AM, David Langer <da...@hotmail.com> wrote:
> >> Greetings!
> >>
> >> Hopefully this isn't too much of a newbie question, but I am unable to get
> >> the --hive-overwrite argument working. I'm using sqoop 1.3.0-cdh3u2 on the
> >> Cloudera VMWare Player VM.
> >>
> >>
> >> The following sqoop invocation succeeds in creating the Hive table and
> >> populates it with data:
> >>
> >> sqoop import --connect
> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera
> >> --query 'SELECT *, 47 AS JobID FROM SalesPerson WHERE $CONDITIONS'
> >> --split-by ID  --target-dir /tmp/SalesPerson --create-hive-table
> >> --hive-import --hive-table MyDB_SalesPerson
> >>
> >>
> >> However, while the following sqoop invocation does produce the desired data
> >> in HDFS (i.e., /tmp/SalesPerson) it does not overwrite the data in the Hive
> >> table:
> >>
> >> sqoop import --connect
> >> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera
> >> --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS'
> >> --split-by ID  --target-dir /tmp/SalesPerson --hive-overwrite --hive-table
> >> MyDB_salesperson
> >>
> >>
> >> There is nothing in Hive.log that indicates the --hive-overwrite sqoop
> >> invocation is interacting with Hive (e.g., no exceptions).
> >>
> >> Any assistance would be greatly appreciated.
> >>
> >> Thanx,
> >>
> >> Dave

Re: The --hive-overwrite doesn't overwrite data

Posted by Kathleen Ting <ka...@cloudera.com>.
Dave - to aid in debugging, please re-run your Sqoop job with the
--verbose flag and then paste the console log.

Thanks, Kathleen

> On Fri, Jan 20, 2012 at 11:51 AM, David Langer <da...@hotmail.com> wrote:
>> Greetings!
>>
>> Hopefully this isn't too much of a newbie question, but I am unable to get
>> the --hive-overwrite argument working. I'm using sqoop 1.3.0-cdh3u2 on the
>> Cloudera VMWare Player VM.
>>
>>
>> The following sqoop invocation succeeds in creating the Hive table and
>> populates it with data:
>>
>> sqoop import --connect
>> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera
>> --query 'SELECT *, 47 AS JobID FROM SalesPerson WHERE $CONDITIONS'
>> --split-by ID  --target-dir /tmp/SalesPerson --create-hive-table
>> --hive-import --hive-table MyDB_SalesPerson
>>
>>
>> However, while the following sqoop invocation does produce the desired data
>> in HDFS (i.e., /tmp/SalesPerson) it does not overwrite the data in the Hive
>> table:
>>
>> sqoop import --connect
>> 'jdbc:mysql://localhost/MyDB?zeroDateTimeBehavior=round' --username cloudera
>> --query 'SELECT *, 87 AS JobID FROM SalesPerson WHERE $CONDITIONS'
>> --split-by ID  --target-dir /tmp/SalesPerson --hive-overwrite --hive-table
>> MyDB_salesperson
>>
>>
>> There is nothing in Hive.log that indicates the --hive-overwrite sqoop
>> invocation is interacting with Hive (e.g., no exceptions).
>>
>> Any assistance would be greatly appreciated.
>>
>> Thanx,
>>
>> Dave