Posted to user@flink.apache.org by "LINZ, Arnaud" <AL...@bouyguestelecom.fr> on 2016/12/09 09:35:05 UTC

OutOfMemory when looping on dataset filter

Hello,

I have a non-distributed treatment to apply to a DataSet of timed events, one day after another, in a Flink batch job.
My algorithm is:

// wholeSet is too big to fit in RAM with a collect(), so we cut it in pieces
DataSet wholeSet = [Select WholeSet];
for (day 1 to 31) {
                List<> dayData = wholeSet.filter(day).collect();
                applyComplexNonDistributedTreatment(dayData);
}
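
For concreteness, a minimal compilable version of this sketch; the CSV source and the (day, payload) tuple schema are placeholders for the real "[Select WholeSet]" and treatment:

import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class DailyLoop {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for [Select WholeSet]: records of (day, payload).
        DataSet<Tuple2<Integer, String>> wholeSet =
                env.readCsvFile("hdfs:///data/events.csv")
                   .types(Integer.class, String.class);

        for (int day = 1; day <= 31; day++) {
            final int d = day;
            // Every collect() triggers a separate, complete job execution.
            List<Tuple2<Integer, String>> dayData =
                    wholeSet.filter(t -> t.f0 == d).collect();
            applyComplexNonDistributedTreatment(dayData);
        }
    }

    private static void applyComplexNonDistributedTreatment(List<Tuple2<Integer, String>> dayData) {
        // placeholder for the real non-distributed, per-day treatment
    }
}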

Even though each day fits perfectly in RAM (I have run a test where only the first day has data), I quickly get an OOM in a task manager at some point in the loop, so I guess that the “wholeSet” is kept several times in RAM.

Two questions:

1)      Is there a better way of handling this, where the “select wholeSet” is executed only once?

2)      Even if the “select wholeSet” is executed at each iteration, how can I completely release the old set so that I don’t get an OOM?

Thanks,
Arnaud


RE: OutOfMemory when looping on dataset filter

Posted by "LINZ, Arnaud" <AL...@bouyguestelecom.fr>.
Hi,
It works with a local cluster; I am indeed using a YARN cluster here.

Pushing user code to the lib folder of every data node is not convenient; it is hard to maintain and operate.

If I cannot make the treatment serializable so that everything fits in a group reduce function, I think I will try materializing the day-split dataset on HDFS and then loop over re-reading it in the job manager. That is probably even faster than looping on the full select.
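
A rough sketch of that idea, under a hypothetical (day, payload) CSV schema (paths and types are placeholders): a first job writes all 31 day partitions, so the full select should only need to run once, and each later iteration reads back a single, small partition.

import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem.WriteMode;

public class MaterializeDays {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> wholeSet =
                env.readCsvFile("hdfs:///data/events.csv")
                   .types(Integer.class, String.class);

        // Job 1: fan the single select out into 31 day sinks.
        for (int day = 1; day <= 31; day++) {
            final int d = day;
            wholeSet.filter(t -> t.f0 == d)
                    .writeAsCsv("hdfs:///tmp/events/day=" + d, WriteMode.OVERWRITE);
        }
        env.execute("materialize day partitions");

        // Jobs 2..32: each iteration reads only its own (small) day partition.
        for (int day = 1; day <= 31; day++) {
            List<Tuple2<Integer, String>> dayData =
                    env.readCsvFile("hdfs:///tmp/events/day=" + day)
                       .types(Integer.class, String.class)
                       .collect();
            // applyComplexNonDistributedTreatment(dayData);
        }
    }
}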

Arnaud




Re: OutOfMemory when looping on dataset filter

Posted by Stephan Ewen <se...@apache.org>.
Hi Arnaud!

I assume you are using either a standalone setup, or a YARN session?

This looks to me as if classes cannot be properly garbage collected. Since
each job loads the classes again (each day is executed as a separate job),
the PermGen space fills up if classes are not properly collected.

There can be many reasons why classes are not properly collected; most
prominently, some user code or libraries create threads that hold on to
objects.

A quick workaround could be to simply add the relevant libraries directly
to the "lib" folder when starting the YARN session, rather than having them
in the user code jar file. That way, they need not be reloaded for each job.

Greetings,
Stephan




RE: OutOfMemory when looping on dataset filter

Posted by "LINZ, Arnaud" <AL...@bouyguestelecom.fr>.
Hi,
Caching could have been a solution. Another one is using a “group reduce” by day, but for that I need to make the “applyComplexNonDistributedTreatment” serializable, and that’s not an easy task.
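
For reference, a minimal sketch of that "group reduce by day" alternative, again with a hypothetical (day, payload) tuple schema and output path; the catch is exactly that this function, and everything the treatment references, has to be serializable and run inside a task manager.

import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem.WriteMode;
import org.apache.flink.util.Collector;

public class GroupReduceByDay {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> wholeSet =
                env.readCsvFile("hdfs:///data/events.csv")
                   .types(Integer.class, String.class);

        wholeSet
            .groupBy(0) // group by the day field
            .reduceGroup(new GroupReduceFunction<Tuple2<Integer, String>, String>() {
                @Override
                public void reduce(Iterable<Tuple2<Integer, String>> dayEvents,
                                   Collector<String> out) {
                    // Called once per day, on a task manager: this code and every
                    // object it uses must be serializable and shippable.
                    out.collect("result for one day");
                }
            })
            .writeAsText("hdfs:///tmp/day-results", WriteMode.OVERWRITE);

        env.execute("group reduce by day");
    }
}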

1 & 2 - The OOM in my current test occurs in the 8th iteration (7 were successful). In this test, only the first day has data; on the other days the filter() returns an empty dataset.
3 – The OOM is in a task manager, during the “select” phase.

Digging further, I see it’s a PermGen OOM occurring during deserialization, not a heap one.

2016-12-08 17:38:27,835 ERROR org.apache.flink.runtime.taskmanager.Task                     - Task execution failed.
java.lang.OutOfMemoryError: PermGen space
                at sun.misc.Unsafe.defineClass(Native Method)
                at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
                at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
                at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
                at java.security.AccessController.doPrivileged(Native Method)
                at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
                at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
                at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
                at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
                at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
                at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
                at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
                at java.security.AccessController.doPrivileged(Native Method)
                at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
                at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
                at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
                at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
                at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
                at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
                at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
                at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
                at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
                at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
                at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
                at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
                at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
                at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
                at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
                at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
                at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUtil.java:117)
                at org.apache.hive.hcatalog.mapreduce.HCatSplit.readFields(HCatSplit.java:139)
                at org.apache.flink.api.java.hadoop.mapreduce.wrapper.HadoopInputSplit.readObject(HadoopInputSplit.java:102)




Re: OutOfMemory when looping on dataset filter

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Arnaud,

Flink does not cache data at the moment.
What happens is that for every day, the complete program is executed, i.e.,
also the program that computes wholeSet.
The executions should be independent of each other, and all temporary data
should be cleaned up.
Since Flink executes programs in a pipelined (or streaming) fashion,
wholeSet is not kept in memory.
There is also no manual way to pin a DataSet in memory at the moment.

One thing you could try is to push the day filter as close to the original
source as possible.
This would reduce the size of intermediate results.
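
A hypothetical illustration of pushing the filter towards the source, with a (day, payload) placeholder schema: filter right after the raw source, before the expensive part of "[Select WholeSet]", so only one day of data flows through the rest of the plan.

import java.util.List;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class EarlyFilter {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Raw source, before any enrichment.
        DataSet<Tuple2<Integer, String>> raw =
                env.readCsvFile("hdfs:///data/events.csv")
                   .types(Integer.class, String.class);

        for (int day = 1; day <= 31; day++) {
            final int d = day;

            // Instead of: raw -> expensive transformations -> filter(day) -> collect,
            // filter first, so the expensive part (stand-in map below) only sees one day.
            List<Tuple2<Integer, String>> dayData = raw
                    .filter(t -> t.f0 == d)
                    .map(new MapFunction<Tuple2<Integer, String>, Tuple2<Integer, String>>() {
                        @Override
                        public Tuple2<Integer, String> map(Tuple2<Integer, String> t) {
                            return new Tuple2<>(t.f0, t.f1.trim()); // placeholder enrichment
                        }
                    })
                    .collect();

            // applyComplexNonDistributedTreatment(dayData);
        }
    }
}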

In general, Flink's DataSet API is implemented to work on managed memory.
The most common reason for OOMs is user functions that collect data on the
heap.
However, this should not accumulate, and it should be cleaned up after a job
finishes.
Collect can be a bit fragile here, because it moves all data to the client
process.

I also have a few questions:
1. After how many iterations of the for loop is the OOM happening?
2. Is the data for all days of the same size?
3. Is the OOM happening in Flink or in the client process which fetches the
result?

Best, Fabian

