Posted to user@hive.apache.org by Sunita Arvind <su...@gmail.com> on 2013/11/11 22:48:31 UTC

Seeking Help configuring log4j for sqoop import into hive

Hello,

I am using Sqoop to import data from Oracle into Hive. Below is my command:

# note: \$CONDITIONS is escaped because the query is double-quoted;
# otherwise the shell would expand it before Sqoop sees it
nohup sqoop import \
  --connect "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxxxxx)(PORT = xxxx)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = CDWQ.tms.toyota.com) (FAILOVER_MODE= (TYPE=select) (METHOD=basic))))" \
  --username "xxxx" --password "xxxx" \
  --split-by employeeid \
  --query "SELECT e.employeeid, p.salary FROM employee e, payroll p WHERE e.employeeid = p.employeeid AND \$CONDITIONS" \
  --create-hive-table --hive-table "EMPLOYEE" --hive-import \
  --target-dir "/user/hive/warehouse/employee" \
  --direct --verbose


Note: This is production data hence I cannot share the log file or actual
query. Sorry for that.

A similar query works for other tables, but for this particular table I get
the exception below:

java.io.IOException: SQLException in nextKeyValue
        at
org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:266)
        at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
        at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
        at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
        at
org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.sql
attempt_201311071517_0011_m_000003_0: log4j:WARN No appenders could be
found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201311071517_0011_m_000003_0: log4j:WARN Please initialize the
log4j system properly.
attempt_201311071517_0011_m_000003_0: log4j:WARN See
http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
13/11/11 11:35:20 INFO mapred.JobClient: Task Id :
attempt_201311071517_0011_m_000000_0, Status : FAILED


I eyeballed the data for date-format issues, which forums suggest are
typically the root cause of such errors, but that does not seem to be the
case here (I could be wrong). I also added the "--direct" option as
suggested in some posts, and that did not help either.
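
(Had it been a type or date-format problem, the fix those posts usually
suggest is an explicit Java type override for the offending column, along
these lines; HIREDATE is a made-up column name for illustration:)

# hypothetical: force a problematic Oracle DATE/TIMESTAMP column to
# import as a plain string
sqoop import ... --map-column-java HIREDATE=String ...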

The actual exception after the "Caused by:" line is missing, which makes me
believe that Sqoop is trying to redirect the output to some log file and
cannot find the necessary configuration, and hence is not dumping the full
stack trace.

*Seeking help from the community: how do I configure Sqoop/log4j so that
the complete stack trace is displayed?*

I looked at the log4j.properties files in the environment but did not find
anything specific to Sqoop:
./etc/cloudera-scm-server/log4j.properties
./etc/hadoop/conf.cloudera.mapreduce1/log4j.properties
./etc/hadoop/conf.cloudera.hdfs1/log4j.properties
./etc/hadoop/conf.empty/log4j.properties
./etc/hadoop-0.20/conf.cloudera.mapreduce1/log4j.properties
./etc/hadoop-0.20/conf.cloudera.hdfs1/log4j.properties
./etc/hue/log4j.properties
./etc/hbase/conf.dist/log4j.properties
./etc/zookeeper/conf.dist/log4j.properties
./etc/pig/conf.dist/log4j.properties
./var/run/cloudera-scm-agent/process/303-mapreduce-TASKTRACKER/log4j.properties
./var/run/cloudera-scm-agent/process/321-hdfs-SECONDARYNAMENODE/log4j.properties
./var/run/cloudera-scm-agent/process/311-hue-BEESWAX_SERVER/hadoop-conf/log4j.properties
./var/run/cloudera-scm-agent/process/307-oozie-OOZIE_SERVER/hadoop-conf/log4j.properties
./var/run/cloudera-scm-agent/process/307-oozie-OOZIE_SERVER/log4j.properties
./var/run/cloudera-scm-agent/process/315-impala-IMPALAD/impala-conf/log4j.properties
./var/run/cloudera-scm-agent/process/308-hive-HIVEMETASTORE/hadoop-conf/log4j.properties
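
For reference, the kind of appender definition the task JVM is warning
about looks roughly like this (an illustrative log4j 1.2 sketch, not the
actual contents of any of the files above):

# minimal log4j 1.2 setup: log to stderr so chained exceptions end up
# in the task's console output (same shape as Hadoop's default config)
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n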

regards
Sunita

Re: Seeking Help configuring log4j for sqoop import into hive

Posted by Sunita Arvind <su...@gmail.com>.
Thanks Jarcec. You are right, the map task log is what I am looking for,
but I don't see it. The log4j warning just below the 'Caused by' line
indicates it is not set up correctly.

I'll redirect the question to the Sqoop user list. Please let me know if
you have any pointers for getting at the map task log.

Sunita


Re: Seeking Help configuring log4j for sqoop import into hive

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Sunita,
Sqoop-specific questions are better asked on the Sqoop user mailing list, user@sqoop.apache.org. You can find instructions on how to subscribe at [1].

I would suggest taking a look at the failed map task's log, as that log usually contains the entire exception, including all the chained exceptions.
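
On an MR1 cluster you can reach it through the JobTracker web UI (job ->
failed task -> attempt -> logs), or directly on the worker node that ran
the attempt, roughly like this (the path below assumes CDH's default MR1
log directory, so adjust for your install; the job and attempt IDs are the
ones from your output):

# on the TaskTracker node that ran the failed attempt:
cd /var/log/hadoop-0.20-mapreduce/userlogs/job_201311071517_0011
less attempt_201311071517_0011_m_000000_0/syslog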

Jarcec

Links:
1: http://sqoop.apache.org/mail-lists.html


Re: Seeking Help configuring log4j for sqoop import into hive

Posted by Sunita Arvind <su...@gmail.com>.
Thanks David,

Very valuable input. Will update the group with my findings.

Regards
Sunita


Re: Seeking Help configuring log4j for sqoop import into hive

Posted by David Morel <dm...@gmail.com>.
On 12 Nov 2013, at 0:01, Sunita Arvind wrote:

> java.io.IOException: SQLException in nextKeyValue
>      at
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:266)
> [...]


This is usually the case when your PK (the column on which Sqoop will try
to do the split) isn't an integer.
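
To illustrate: Sqoop first runs a boundary query over the split column and
then substitutes one generated range predicate per mapper for $CONDITIONS,
so the per-mapper query can fail at read time if the ranges don't make
sense for the column's type. Roughly (a sketch of the generated SQL, not
the literal statements; the range values are invented):

-- boundary query, run once at job submission:
SELECT MIN(employeeid), MAX(employeeid)
  FROM (SELECT e.employeeid, p.salary
          FROM employee e, payroll p
         WHERE e.employeeid = p.employeeid AND (1 = 1)) t1

-- what one map task then executes, $CONDITIONS replaced by a range:
SELECT e.employeeid, p.salary
  FROM employee e, payroll p
 WHERE e.employeeid = p.employeeid
   AND (employeeid >= 1 AND employeeid < 2500)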

my 2c.

David

Re: Seeking Help configuring log4j for sqoop import into hive

Posted by Sunita Arvind <su...@gmail.com>.
Just in case this works as a workaround for someone:
The issue is resolved if I eliminate the "where" clause in the query (just
keep "where $CONDITIONS"). So the two workarounds I can think of for now are:
1. Create views in Oracle and query them without the where clause in the
sqoop import command (sketched below)
2. Import everything in the table (not feasible in most cases)
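
For option 1, something along these lines (table and column names are from
my example above; the view name is made up):

-- in Oracle: bake the join into a view
CREATE VIEW employee_salary_v AS
SELECT e.employeeid, p.salary
  FROM employee e, payroll p
 WHERE e.employeeid = p.employeeid;

# then import from the view with only the mandatory $CONDITIONS token
# (escaped because the query is double-quoted):
sqoop import ... --split-by employeeid \
  --query "SELECT employeeid, salary FROM employee_salary_v WHERE \$CONDITIONS" ...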

However, I still need to know how to get the exact stack trace.

regards
Sunita

