Posted to user@sqoop.apache.org by Harpreet Singh <hs...@gmail.com> on 2017/08/02 11:00:45 UTC

Sqoop job to import data failing due to physical memory breach

Hi All,
I have a Sqoop job running in production that sometimes fails; restarting the
job succeeds.
The logs show the container is killed for running beyond physical memory
limits: "Current usage: 2.3 GB of 2 GB physical memory used; 4.0 GB of 4.2 GB
virtual memory used. Killing container."
Environment:
CDH 5.8.3
Sqoop 1 client
mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
mapreduce.map.memory.mb=2048 (2 GB)
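(For reference, that -Xmx1717986918 works out to roughly 1717986918 / 1024^3
≈ 1.6 GB, i.e. about 80% of the 2 GB container.)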

Sqoop job details: pulling data from Netezza with 6 mappers and writing the
output as Parquet on HDFS. The data processed is 14 GB and the splits seem to
be even.
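The import is invoked roughly like this (host, credentials, table and paths
below are placeholders, not the real values):

    sqoop import \
        --connect jdbc:netezza://<nz-host>:5480/<database> \
        --username <user> \
        --password-file <hdfs-path-to-password-file> \
        --table <source_table> \
        --num-mappers 6 \
        --as-parquetfile \
        --target-dir <hdfs-target-dir>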
Please provide your insights.

Regards
Harpreet Singh

Re: Sqoop job to import data failing due to physical memory breach

Posted by Douglas Spadotto <do...@gmail.com>.
Hi Harpreet,

Try to give more resources to the mappers, or increase the number of
mappers. I don't think there is a direct relation between the sum of all
the mappers' JVM sizes and the input size.
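
For example, something along these lines (the numbers are only an
illustration, and the -D overrides have to come before the other Sqoop
arguments):

    sqoop import \
        -Dmapreduce.map.memory.mb=3072 \
        -Dmapreduce.map.java.opts="-Djava.net.preferIPv4Stack=true -Xmx2457m" \
        <your existing import arguments> \
        --num-mappers 8

That keeps the heap at roughly 80% of the container, the same ratio as your
current settings.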

Regards,

Douglas

On Thu, Aug 3, 2017 at 4:26 AM, Harpreet Singh <hs...@gmail.com> wrote:

> Thanks Douglas,
> The details you asked for:
> yarn.scheduler.minimum-allocation-mb = 2 GB
> yarn.scheduler.maximum-allocation-mb = 128 GB
> Increment = 512 MB
>
> Please help with design considerations about how many mappers should be
> used for Sqoop. Since mapper memory is capped, does this mean the data that
> can be fetched with 6 mappers using 2 GB each is capped at around 12 GB?
> The cluster follows the specified number of mappers exactly and does not
> exceed the task count.
>
> Regards
> Harpreet Singh
>
> On Aug 2, 2017 7:19 PM, "Douglas Spadotto" <do...@gmail.com> wrote:
>
> Hello Harpreet,
>
> It seems that your job is going beyond the limits established.
>
> What are the values for yarn.scheduler.minimum-allocation-mb and
> yarn.scheduler.maximum-allocation-mb on your cluster?
>
> Some background on the meaning of these configurations can be found here:
> https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
>
> Regards,
>
> Douglas
>
> On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <hs...@gmail.com>
> wrote:
>
>> Hi All,
>> I have a Sqoop job running in production that sometimes fails; restarting
>> the job succeeds.
>> The logs show the container is killed for running beyond physical memory
>> limits: "Current usage: 2.3 GB of 2 GB physical memory used; 4.0 GB of
>> 4.2 GB virtual memory used. Killing container."
>> Environment:
>> CDH 5.8.3
>> Sqoop 1 client
>> mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
>> mapreduce.map.memory.mb=2048 (2 GB)
>>
>> Sqoop job details: pulling data from Netezza with 6 mappers and writing the
>> output as Parquet on HDFS. The data processed is 14 GB and the splits seem
>> to be even.
>> Please provide your insights.
>>
>> Regards
>> Harpreet Singh
>>
>
>
>

Re: Sqoop job to import data failing due to physical memory breach

Posted by Harpreet Singh <hs...@gmail.com>.
Thanks Douglas,
The details you asked for:
yarn.scheduler.minimum-allocation-mb = 2 GB
yarn.scheduler.maximum-allocation-mb = 128 GB
Increment = 512 MB

Please help with design considerations about how many mappers should be
used for Sqoop. Since mapper memory is capped, does this mean the data that
can be fetched with 6 mappers using 2 GB each is capped at around 12 GB?
The cluster follows the specified number of mappers exactly and does not
exceed the task count.

Regards
Harpreet Singh

On Aug 2, 2017 7:19 PM, "Douglas Spadotto" <do...@gmail.com> wrote:

Hello Harpreet,

It seems that your job is going beyond the limits established.

What are the values for yarn.scheduler.minimum-allocation-mb and
yarn.scheduler.maximum-allocation-mb on your cluster?

Some background on the meaning of these configurations can be found here:
https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters

Regards,

Douglas

On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <hs...@gmail.com> wrote:

> Hi All,
> I have a Sqoop job running in production that sometimes fails; restarting
> the job succeeds.
> The logs show the container is killed for running beyond physical memory
> limits: "Current usage: 2.3 GB of 2 GB physical memory used; 4.0 GB of
> 4.2 GB virtual memory used. Killing container."
> Environment:
> CDH 5.8.3
> Sqoop 1 client
> mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
> mapreduce.map.memory.mb=2048 (2 GB)
>
> Sqoop job details: pulling data from Netezza with 6 mappers and writing the
> output as Parquet on HDFS. The data processed is 14 GB and the splits seem
> to be even.
> Please provide your insights.
>
> Regards
> Harpreet Singh
>

Re: Sqoop job to import data failing due to physical memory breach

Posted by Douglas Spadotto <do...@gmail.com>.
Hello Harpreet,

It seems that your job is going beyond the limits established.

What are the values for yarn.scheduler.minimum-allocation-mb and
yarn.scheduler.maximum-allocation-mb
on your cluster?

Some background on the meaning of these configurations can be found here:
https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
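
For reference, these are usually set in yarn-site.xml (or via Cloudera
Manager on CDH); the values below are only an illustration:

    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>2048</value>   <!-- smallest container YARN will allocate -->
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>131072</value> <!-- largest container YARN will allocate -->
    </property>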

Regards,

Douglas

On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <hs...@gmail.com> wrote:

> Hi All,
> I have a Sqoop job running in production that sometimes fails; restarting
> the job succeeds.
> The logs show the container is killed for running beyond physical memory
> limits: "Current usage: 2.3 GB of 2 GB physical memory used; 4.0 GB of
> 4.2 GB virtual memory used. Killing container."
> Environment:
> CDH 5.8.3
> Sqoop 1 client
> mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
> mapreduce.map.memory.mb=2048 (2 GB)
>
> Sqoop job details: pulling data from Netezza with 6 mappers and writing the
> output as Parquet on HDFS. The data processed is 14 GB and the splits seem
> to be even.
> Please provide your insights.
>
> Regards
> Harpreet Singh
>