You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@harmony.apache.org by Xiao-Feng Li <xi...@gmail.com> on 2007/02/14 03:47:39 UTC

[DRLVM][GC] low pause garbage collection for Harmony

Hi,

Harmony now has a reasonably advanced and stable parallel/generational
GC built for 32bit platforms (the GCv5). The remaining work for GCv5 I
think is mainly about 64bit port and leverage of large heap size
enabled by 64bit, while performance tuning is always a continuous
effort.

Besides the ongoing work of GCv5, I would like to start thinking of a
low-pause garbage collector for Harmony now, since some Harmony users
might expect their applicaitons' execution interrupt for garbage
collection to be as short as possible. For them, the "throughput" of
GC is not all they want. GC's "pause time" or "latency" or "response
time" is critical as well.

Low-pause GC usually means "concurrent GC", in contrast to
"stop-the-world GC". In concurrent GC, the mutators (application
threads) can keep running while collectors (GC threads) are doing
garbage collection. GCv5 so far is a "stop-the-world" GC, where all
the mutators are suspended when a collection is started.

The concept "parallel" is orthogonal to "concurrent". "Parallel" GC
refers to that a collection can be conducted by multiple collector
threads simultaneously. "Generational" is orthogonal as well.

There is a claimed "pauseless GC" by Azul Systems [1], which depends
on Azul's specific hardware support for read/write barriers. Without
HW support, read barriers can be expensive [2]; but I think a
very-short-pause-time GC is acceptable for Harmony, at least good
enough in the near future.

Some researchers seperate "on-the-fly" GC from concurrent GC as a
special case [3]. The difference as stated is "on-the-fly" GC doesn't
require any synchronization point where all mutators are suspended,
i.e., it suspends and resumes mutators one after another, not at the
same time. There is also "real-time" GC proposed that can satisfy
required real-time bounds. Metronome is one example [4].

Considering the support in available platforms and Harmony's
objectives, an on-the-fly GC might be our choice. But before that, we
can have a traditional concurrent GC implemented, and adapt it into
on-the-fly.

Anybody has good advices? Thanks.

-xiaofeng

[1] http://www.usenix.org/events/vee05/full_papers/p46-click.pdf
[2] http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/wb-ismm-2004.pdf
[3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
[4] http://www.research.ibm.com/people/d/dfb/papers/Bacon03Realtime.pdf

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Ivan Volosyuk <iv...@gmail.com>.
On 2/16/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 2/15/07, Ivan Volosyuk <iv...@gmail.com> wrote:
> > On 2/14/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> > > Some researchers seperate "on-the-fly" GC from concurrent GC as a
> > > special case [3]. The difference as stated is "on-the-fly" GC doesn't
> > > require any synchronization point where all mutators are suspended,
> > > i.e., it suspends and resumes mutators one after another, not at the
> > > same time. There is also "real-time" GC proposed that can satisfy
> > > required real-time bounds. Metronome is one example [4].
> >
> > > [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
> >
> > I have taken a look at the ms-sliding-views.ps paper. The on-the-fly
> > GC looks quite promising. One thing that bothers me is it is of 'mark
> > and sweep' type. It require different allocator algorithm to keep
> > fragmentation low, it means for me (correct me) that the allocation
> > rate would be rather slow and it will have suboptimal data locality. I
> > understand that we won't have such low pauses if we would like to move
> > data on a GC without read barriers. One thing which really excite me
> > that it doesn't require stop-the-world at all, each thread is stopped
> > independently, that means that it's really scalable by the number of
> > running threads.
>
> Ivan, thanks. I have basically the same understanding of the
> on-the-fly mark-sweep algortihm as you. It may need compaction
> algorithm to reduce the accumulated fragmentation effect. Compressor
> can compact the heap with rather low pause-time by leveraging the OS
> mem-protect mechanism and tripple-mapping the virtual space.

It should be very specific compaction algorithm. We need to presume
general heap structure: per-processor segregated free-lists, large
object space and all the corresponding metadata.

--
Ivan

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 2/15/07, Ivan Volosyuk <iv...@gmail.com> wrote:
> On 2/14/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> > Some researchers seperate "on-the-fly" GC from concurrent GC as a
> > special case [3]. The difference as stated is "on-the-fly" GC doesn't
> > require any synchronization point where all mutators are suspended,
> > i.e., it suspends and resumes mutators one after another, not at the
> > same time. There is also "real-time" GC proposed that can satisfy
> > required real-time bounds. Metronome is one example [4].
>
> > [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
>
> I have taken a look at the ms-sliding-views.ps paper. The on-the-fly
> GC looks quite promising. One thing that bothers me is it is of 'mark
> and sweep' type. It require different allocator algorithm to keep
> fragmentation low, it means for me (correct me) that the allocation
> rate would be rather slow and it will have suboptimal data locality. I
> understand that we won't have such low pauses if we would like to move
> data on a GC without read barriers. One thing which really excite me
> that it doesn't require stop-the-world at all, each thread is stopped
> independently, that means that it's really scalable by the number of
> running threads.

Ivan, thanks. I have basically the same understanding of the
on-the-fly mark-sweep algortihm as you. It may need compaction
algorithm to reduce the accumulated fragmentation effect. Compressor
can compact the heap with rather low pause-time by leveraging the OS
mem-protect mechanism and tripple-mapping the virtual space.

Thanks,
xiaofeng

> --
> Ivan
> Intel Enterprise Solutions Software Division
>

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Ivan Volosyuk <iv...@gmail.com>.
On 2/14/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> Some researchers seperate "on-the-fly" GC from concurrent GC as a
> special case [3]. The difference as stated is "on-the-fly" GC doesn't
> require any synchronization point where all mutators are suspended,
> i.e., it suspends and resumes mutators one after another, not at the
> same time. There is also "real-time" GC proposed that can satisfy
> required real-time bounds. Metronome is one example [4].

> [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps

I have taken a look at the ms-sliding-views.ps paper. The on-the-fly
GC looks quite promising. One thing that bothers me is it is of 'mark
and sweep' type. It require different allocator algorithm to keep
fragmentation low, it means for me (correct me) that the allocation
rate would be rather slow and it will have suboptimal data locality. I
understand that we won't have such low pauses if we would like to move
data on a GC without read barriers. One thing which really excite me
that it doesn't require stop-the-world at all, each thread is stopped
independently, that means that it's really scalable by the number of
running threads.

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 2/15/07, Rana Dasgupta <rd...@gmail.com> wrote:
> On 2/14/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> >
> >
> > >I would suggest to do it after the 64bit support is ready. A new patch
> > >will be submitted for it soon (probably in one or two weeks).
> >
> How about letting us know when it is ready for serious testing, and maybe
> we can run a trial week with gcv5 as the default? That should identify all
> the bugs. The performance comparisons also need to be done, but possibly
> after this.

I think two weeks are ok. Thanks, xiaofeng

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Rana Dasgupta <rd...@gmail.com>.
On 2/14/07, Xiao-Feng Li <xi...@gmail.com> wrote:
>
>
> >I would suggest to do it after the 64bit support is ready. A new patch
> >will be submitted for it soon (probably in one or two weeks).
>
> How about letting us know when it is ready for serious testing, and maybe
> we can run a trial week with gcv5 as the default? That should identify all
> the bugs. The performance comparisons also need to be done, but possibly
> after this.

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 2/14/07, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> On Feb 14, 2007, at 7:33 AM, Gregory Shimansky wrote:
>
> > Xiao-Feng Li wrote:
> >> Hi,
> >> Harmony now has a reasonably advanced and stable parallel/
> >> generational
> >> GC built for 32bit platforms (the GCv5). The remaining work for
> >> GCv5 I
> >> think is mainly about 64bit port and leverage of large heap size
> >> enabled by 64bit, while performance tuning is always a continuous
> >> effort.
> >
> > If we have GCv5 ready for testing, should we start running tests
> > with it?
>
> That was going to be my suggest - start testing and evaluate how it
> performs against v4, how it does w/ the test suites, and then decide
> if it's time to switch to it.

I would suggest to do it after the 64bit support is ready. A new patch
will be submitted for it soon (probably in one or two weeks).

Btw, since it's close to Chinese New Year, please expect delayed
response from me since I am doing a stop-the-world engergy collection.
:-)  You see I am trying to conduct the collection concurrently, at
least with a shorter reponse latency than last collection cycle.

Thanks,
xiaofeng

> geir
>
> >
> >> Besides the ongoing work of GCv5, I would like to start thinking of a
> >> low-pause garbage collector for Harmony now, since some Harmony users
> >> might expect their applicaitons' execution interrupt for garbage
> >> collection to be as short as possible. For them, the "throughput" of
> >> GC is not all they want. GC's "pause time" or "latency" or "response
> >> time" is critical as well.
> >> Low-pause GC usually means "concurrent GC", in contrast to
> >> "stop-the-world GC". In concurrent GC, the mutators (application
> >> threads) can keep running while collectors (GC threads) are doing
> >> garbage collection. GCv5 so far is a "stop-the-world" GC, where all
> >> the mutators are suspended when a collection is started.
> >> The concept "parallel" is orthogonal to "concurrent". "Parallel" GC
> >> refers to that a collection can be conducted by multiple collector
> >> threads simultaneously. "Generational" is orthogonal as well.
> >> There is a claimed "pauseless GC" by Azul Systems [1], which depends
> >> on Azul's specific hardware support for read/write barriers. Without
> >> HW support, read barriers can be expensive [2]; but I think a
> >> very-short-pause-time GC is acceptable for Harmony, at least good
> >> enough in the near future.
> >> Some researchers seperate "on-the-fly" GC from concurrent GC as a
> >> special case [3]. The difference as stated is "on-the-fly" GC doesn't
> >> require any synchronization point where all mutators are suspended,
> >> i.e., it suspends and resumes mutators one after another, not at the
> >> same time. There is also "real-time" GC proposed that can satisfy
> >> required real-time bounds. Metronome is one example [4].
> >> Considering the support in available platforms and Harmony's
> >> objectives, an on-the-fly GC might be our choice. But before that, we
> >> can have a traditional concurrent GC implemented, and adapt it into
> >> on-the-fly.
> >> Anybody has good advices? Thanks.
> >> -xiaofeng
> >> [1] http://www.usenix.org/events/vee05/full_papers/p46-click.pdf
> >> [2] http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/wb-
> >> ismm-2004.pdf
> >> [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
> >> [4] http://www.research.ibm.com/people/d/dfb/papers/
> >> Bacon03Realtime.pdf
> >
> >
> > --
> > Gregory
> >
>
>

Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.
On Feb 14, 2007, at 7:33 AM, Gregory Shimansky wrote:

> Xiao-Feng Li wrote:
>> Hi,
>> Harmony now has a reasonably advanced and stable parallel/ 
>> generational
>> GC built for 32bit platforms (the GCv5). The remaining work for  
>> GCv5 I
>> think is mainly about 64bit port and leverage of large heap size
>> enabled by 64bit, while performance tuning is always a continuous
>> effort.
>
> If we have GCv5 ready for testing, should we start running tests  
> with it?

That was going to be my suggest - start testing and evaluate how it  
performs against v4, how it does w/ the test suites, and then decide  
if it's time to switch to it.

geir

>
>> Besides the ongoing work of GCv5, I would like to start thinking of a
>> low-pause garbage collector for Harmony now, since some Harmony users
>> might expect their applicaitons' execution interrupt for garbage
>> collection to be as short as possible. For them, the "throughput" of
>> GC is not all they want. GC's "pause time" or "latency" or "response
>> time" is critical as well.
>> Low-pause GC usually means "concurrent GC", in contrast to
>> "stop-the-world GC". In concurrent GC, the mutators (application
>> threads) can keep running while collectors (GC threads) are doing
>> garbage collection. GCv5 so far is a "stop-the-world" GC, where all
>> the mutators are suspended when a collection is started.
>> The concept "parallel" is orthogonal to "concurrent". "Parallel" GC
>> refers to that a collection can be conducted by multiple collector
>> threads simultaneously. "Generational" is orthogonal as well.
>> There is a claimed "pauseless GC" by Azul Systems [1], which depends
>> on Azul's specific hardware support for read/write barriers. Without
>> HW support, read barriers can be expensive [2]; but I think a
>> very-short-pause-time GC is acceptable for Harmony, at least good
>> enough in the near future.
>> Some researchers seperate "on-the-fly" GC from concurrent GC as a
>> special case [3]. The difference as stated is "on-the-fly" GC doesn't
>> require any synchronization point where all mutators are suspended,
>> i.e., it suspends and resumes mutators one after another, not at the
>> same time. There is also "real-time" GC proposed that can satisfy
>> required real-time bounds. Metronome is one example [4].
>> Considering the support in available platforms and Harmony's
>> objectives, an on-the-fly GC might be our choice. But before that, we
>> can have a traditional concurrent GC implemented, and adapt it into
>> on-the-fly.
>> Anybody has good advices? Thanks.
>> -xiaofeng
>> [1] http://www.usenix.org/events/vee05/full_papers/p46-click.pdf
>> [2] http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/wb- 
>> ismm-2004.pdf
>> [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
>> [4] http://www.research.ibm.com/people/d/dfb/papers/ 
>> Bacon03Realtime.pdf
>
>
> -- 
> Gregory
>


Re: [DRLVM][GC] low pause garbage collection for Harmony

Posted by Gregory Shimansky <gs...@gmail.com>.
Xiao-Feng Li wrote:
> Hi,
> 
> Harmony now has a reasonably advanced and stable parallel/generational
> GC built for 32bit platforms (the GCv5). The remaining work for GCv5 I
> think is mainly about 64bit port and leverage of large heap size
> enabled by 64bit, while performance tuning is always a continuous
> effort.

If we have GCv5 ready for testing, should we start running tests with it?

> Besides the ongoing work of GCv5, I would like to start thinking of a
> low-pause garbage collector for Harmony now, since some Harmony users
> might expect their applicaitons' execution interrupt for garbage
> collection to be as short as possible. For them, the "throughput" of
> GC is not all they want. GC's "pause time" or "latency" or "response
> time" is critical as well.
> 
> Low-pause GC usually means "concurrent GC", in contrast to
> "stop-the-world GC". In concurrent GC, the mutators (application
> threads) can keep running while collectors (GC threads) are doing
> garbage collection. GCv5 so far is a "stop-the-world" GC, where all
> the mutators are suspended when a collection is started.
> 
> The concept "parallel" is orthogonal to "concurrent". "Parallel" GC
> refers to that a collection can be conducted by multiple collector
> threads simultaneously. "Generational" is orthogonal as well.
> 
> There is a claimed "pauseless GC" by Azul Systems [1], which depends
> on Azul's specific hardware support for read/write barriers. Without
> HW support, read barriers can be expensive [2]; but I think a
> very-short-pause-time GC is acceptable for Harmony, at least good
> enough in the near future.
> 
> Some researchers seperate "on-the-fly" GC from concurrent GC as a
> special case [3]. The difference as stated is "on-the-fly" GC doesn't
> require any synchronization point where all mutators are suspended,
> i.e., it suspends and resumes mutators one after another, not at the
> same time. There is also "real-time" GC proposed that can satisfy
> required real-time bounds. Metronome is one example [4].
> 
> Considering the support in available platforms and Harmony's
> objectives, an on-the-fly GC might be our choice. But before that, we
> can have a traditional concurrent GC implemented, and adapt it into
> on-the-fly.
> 
> Anybody has good advices? Thanks.
> 
> -xiaofeng
> 
> [1] http://www.usenix.org/events/vee05/full_papers/p46-click.pdf
> [2] http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/wb-ismm-2004.pdf
> [3] http://www.cs.technion.ac.il/~erez/Papers/ms-sliding-views.ps
> [4] http://www.research.ibm.com/people/d/dfb/papers/Bacon03Realtime.pdf
> 


-- 
Gregory