Posted to user@sqoop.apache.org by Mark Libucha <ml...@gmail.com> on 2016/06/09 15:51:16 UTC

oraoop import OutOfMemoryError

Hi, I keep running out of JVM heap when trying to import a large Oracle
table with the --direct flag set. Smaller tables import successfully.

The stack trace in the mapper log shows:

2016-06-09 14:59:21,266 FATAL [main] org.apache.hadoop.mapred.YarnChild:
Error running child : java.lang.OutOfMemoryError: Java heap space

and, further down, this (probably irrelevant?) exception:

Caused by: java.sql.SQLException: Protocol violation: [8, 1]

The line that gets printed to stdout just before the job runs:

16/06/09 15:21:57 INFO oracle.OraOopDataDrivenDBInputFormat: The table
being imported by sqoop has 80751872 blocks that have been divided into
5562 chunks which will be processed in 16 splits. The chunks will be
allocated to the splits using the method : ROUNDROBIN

I've tried adding -Dmapred.child.java.opts=-Xmx4000M to the command, but
that doesn't help. I've also tried increasing and decreasing the number of
splits.
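
For what it's worth, since mapred.child.java.opts is deprecated on Hadoop
2.x / YARN and the container size usually has to be raised along with the
heap, I may also try a variant along these lines (standard Hadoop 2.x
property names; the values are only illustrative, and the Hive options from
the full command are left out here for brevity):

sqoop import \
  -Dmapreduce.map.memory.mb=5120 \
  -Dmapreduce.map.java.opts=-Xmx4096m \
  --connect jdbc:oracle:thin:@ldap://myhost:389/somedb,cn=OracleContext,dc=mycom,dc=com \
  --username myusername --password mypassword \
  --table mydb.mytable --direct -m 16 \
  --target-dir /tmp/sqoop_test

Note that the generic -D options have to come before the Sqoop-specific
arguments.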

The full command looks like this:

sqoop import \
  -Dmapred.child.java.opts=-Xmx4000M \
  -Dmapred.map.max.attempts=1 \
  --connect jdbc:oracle:thin:@ldap://myhost:389/somedb,cn=OracleContext,dc=mycom,dc=com \
  --username myusername --password mypassword \
  --table mydb.mytable \
  --columns "COL1, COL2, COL50" \
  --hive-partition-key "ds" --hive-partition-value "20160607" \
  --hive-database myhivedb --hive-table myhivetable --hive-import \
  --null-string "" --null-non-string "" \
  --direct --create-hive-table -m 16 \
  --delete-target-dir --target-dir /tmp/sqoop_test

Thanks for any suggestions.

Mark

Re: oraoop import OutOfMemoryError

Posted by Mark Libucha <ml...@gmail.com>.
I solved this problem. It was a simple fix.

-Doracle.row.fetch.size=1000

The default was 5000, and since I was grabbing a lot of columns, each fetch
batch was taking up more memory than the JVM could handle. I had been using
the --fetch-size option, but that wasn't helping.
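
In case it helps anyone else, the working invocation looks something like
this sketch (same command as before, with the extra generic -D option added
up front; 1000 is just the value that worked for my row width, and the Hive
options are trimmed here for brevity):

sqoop import \
  -Doracle.row.fetch.size=1000 \
  -Dmapred.child.java.opts=-Xmx4000M \
  --connect jdbc:oracle:thin:@ldap://myhost:389/somedb,cn=OracleContext,dc=mycom,dc=com \
  --username myusername --password mypassword \
  --table mydb.mytable --direct -m 16 \
  --target-dir /tmp/sqoop_test

The oracle.row.fetch.size property controls how many rows the direct
(OraOop) connector pulls from Oracle per round trip, so with very wide rows
a smaller value keeps each fetch batch inside the mapper heap.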
