Posted to user@sqoop.apache.org by Michal Vince <vi...@gmail.com> on 2016/03/14 15:43:43 UTC

sqoop2 yarn container is running beyond physical memory limits

Hi guys
I'm trying to get a grip on sqoop2. I'm running a hadoop2 cluster with 2 nodes; yarn has 28 GB of memory, 24 cores and 4 disks available.
The minimum allocation for a yarn container is 1024 MB of RAM, 1 core and 0 disks.

I'm trying to dump my relatively large table to hdfs - 25M rows, 33 columns, stored in MariaDB with the TokuDB engine - using Sqoop's generic JDBC driver.

Every time I try to run the job in sqoop2 I'm getting:

2016-03-14 13:07:29,427 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1457009691885_0029_m_000004_0: Container [pid=6536,containerID=container_e09_1457009691885_0029_01_000008] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.


I tried using different numbers of executors, from 5 to 10k, but with no luck.

It looks to me like sqoop is allocating the minimum resources for the container. Is there any way to configure sqoop to allocate more memory for this job, or is the only way to change the yarn settings?

thanks a lot





Re: sqoop2 yarn container is running beyond physical memory limits

Posted by Michal Vince <vi...@gmail.com>.
Hi Abe

Yes, this solves my problem.

Thank you very much for the response
Michal


On 03/14/2016 10:31 PM, Abraham Fine wrote:
> Hi Michal-
>
> Currently there is no “official” mechanism for overriding YARN settings within Sqoop 2.
>
> Sqoop loads all -site.xml files for configuration, so if you are comfortable changing the YARN settings for all jobs launched by this Sqoop server, you could add another file that specifies the memory settings you want rather than changing them cluster-wide.
>
> Let me know if this solves your problem.
>
> Thanks,
> Abe


Re: sqoop2 yarn container is running beyond physical memory limits

Posted by Abraham Fine <ab...@abrahamfine.com>.
Hi Michal-

Currently there is no “official” mechanism for overriding YARN settings within Sqoop 2.

Sqoop loads all -site.xml files for configuration, so if you are comfortable changing the YARN settings for all jobs launched by this Sqoop server, you could add another file that specifies the memory settings you want rather than changing them cluster-wide.
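
For example, something along these lines should raise the mapper memory - this is just a sketch, assuming your jobs go through the MapReduce engine and that a file like this ends up in the same configuration directory the server reads its other -site.xml files from (the file name is up to you, I'm making one up here):

  <?xml version="1.0"?>
  <configuration>
    <!-- give each map task a 4 GB container instead of the 1 GB minimum -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <!-- keep the JVM heap below the container size so YARN does not kill it -->
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx3276m</value>
    </property>
  </configuration>

These are the standard MapReduce memory properties, nothing Sqoop-specific, and the values are only an illustration; depending on which container is being killed you may also need to raise mapreduce.reduce.memory.mb or yarn.app.mapreduce.am.resource.mb.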

Let me know if this solves your problem.

Thanks,
Abe

> On Mar 14, 2016, at 7:43 AM, Michal Vince <vi...@gmail.com> wrote:
> 
> Hi guys
> I'm trying to get a grip on sqoop2. I'm running a hadoop2 cluster with 2 nodes; yarn has 28 GB of memory, 24 cores and 4 disks available.
> The minimum allocation for a yarn container is 1024 MB of RAM, 1 core and 0 disks.
> 
> I'm trying to dump my relatively large table to hdfs - 25M rows, 33 columns, stored in MariaDB with the TokuDB engine - using Sqoop's generic JDBC driver.
> 
> Every time I try to run the job in sqoop2 I'm getting:
> 
> 2016-03-14 13:07:29,427 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1457009691885_0029_m_000004_0: Container [pid=6536,containerID=container_e09_1457009691885_0029_01_000008] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
> 
> 
> I tried using different numbers of executors, from 5 to 10k, but with no luck.
> 
> It looks to me like sqoop is allocating the minimum resources for the container. Is there any way to configure sqoop to allocate more memory for this job, or is the only way to change the yarn settings?
> 
> thanks a lot
> 
> 
> 
>