Posted to dev@harmony.apache.org by Apache Harmony Bootstrap JVM <bo...@earthlink.net> on 2005/10/19 19:47:48 UTC

Re: Some questions about the architecture

Rodrigo,

At some point, _somebody_ has to wait on I/O.  I agree
that this is not the most efficient implementation, but one
of the advantages it has is that it does not need _any_
gc_safepoint() type calls for read or write barriers.
I am _definitely_ interested in your suggestions, and
I think others will agree with you, but let's get the code
up and running as it stands so we can try other approaches
and compare what good things they bring to the table
instead of, or even in addition to, the existing approach.

The priorities that I set were (1) get the logic working
without resorting to design changes such as multi-threading,
then (2) optimize the implementation and evaluate
improvements and architectural changes, then (3) implement
improvements and architectural changes.  The same goes
for the object model using the OBJECT() macro and the
'robject' structure in 'jvm/src/object.h'.  And the CLASS()
macro, and the STACK() macro, and other components
that I have tried to implement in a modular fashion (see 'README'
for a discussion of this issue).  Let's get it working, then look into
design changes, even having more than one option available at
configuration time, compile time, or even run time, such as is
now the case with the HEAP_xxx() macros and the GC_xxx()
macros that Robin Garner has been asking about.

As to the 'jvm/src/timeslice.c' code, notice that each
time that SIGALRM is received, the handler sets a
volatile boolean that is read by the JVM inner loop
in 'while ( ... || (rfalse == pjvm->timeslice_expired))'
in 'jvm/src/opcode.c' to check if it is time to give the
next thread some time.  I don't expect this to be the
most efficient check, but it _should_ work properly
since I have unit tested the time slicing code, both
the while() test and the setting of the boolean in
timeslice_tick().  One thing I have heard on this
list is that one of the implementations, I think it was
IBM's Jikes (?), chose an interpreter over a JIT.
Now that is not directly related to time slicing, but
it does mean that a mechanism like the one I
implemented does not need compile-time support.
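
To make the flag-polling idea concrete, here is a minimal standalone sketch. It is not the code in 'jvm/src/timeslice.c': the names timeslice_start() and run_one_slice() and the file-scope flag are hypothetical stand-ins for timeslice_tick() and pjvm->timeslice_expired, and it assumes a POSIX system where setitimer() delivers SIGALRM to the interpreter's own thread.

```c
#define _DEFAULT_SOURCE 1   /* expose POSIX declarations on glibc */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical stand-in for pjvm->timeslice_expired: written only by the
 * signal handler, read by the interpreter loop. */
static volatile sig_atomic_t timeslice_expired = 0;

static void on_sigalrm(int signo)
{
    (void)signo;
    timeslice_expired = 1;
}

/* Install the SIGALRM handler and arm a repeating timer of 'usec'
 * microseconds (assumed to be between 1 and 999999).
 * Returns 0 on success, -1 on failure. */
int timeslice_start(long usec)
{
    struct sigaction sa;
    struct itimerval it;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigalrm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&it, 0, sizeof it);
    it.it_interval.tv_usec = usec;
    it.it_value.tv_usec = usec;
    return setitimer(ITIMER_REAL, &it, NULL);
}

/* Inner loop: run "virtual instructions" until the slice expires or the
 * step budget is used up. Returns the number of steps executed. */
long run_one_slice(long max_steps)
{
    long steps = 0;

    timeslice_expired = 0;
    while (steps < max_steps && !timeslice_expired) {
        steps++;    /* stand-in for fetching and executing one opcode */
    }
    return steps;
}
```

An outer loop would call run_one_slice() once per runnable thread. The only cost added to the inner loop is one read of the flag per instruction, which is why nothing has to be compiled into the Java code itself.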

*** How about you JVM experts out there?  Do you have
      any wisdom for me on the subject of time slicing
      in an interpreter built on an outer/inner loop
      pair?  And compared to JIT?  Archie Cobbs,
      what do you think?  How about you lurkers out there? ***

As to your question about setjmp/longjmp, I agree that
there are other ways to do it.  In fact, I originally used
stack walking, in one sense, to return from fatal errors
in my original implementation of the heap
allocator, which used malloc/free.  If I got an error
from malloc(), I simply returned a NULL pointer, which
I tested from the calling function.  If I got this error,
I returned to its caller with an error, and so on, all the
way up.  However, what happens when you have a
normally (void) return?  Use TRUE/FALSE instead?
Could be.  But the more I developed the code, the
harder this became to support.  Therefore, since fatal
errors kill the application anyway, I decided to _VASTLY_
simplify the code by using what is effectively the OO concept
of an exception as available in the 'C' runtime library
with setjmp/longjmp.  Notice that many complicated models
can end up with irresolvable terminal conditions and that
the simplest way to escape is back to a known good state.
This is the purpose of setjmp/longjmp.  Try this on for size
with any communication protocol implementation, such as
TCP/IP some time.  When you get to a snarled condition where
there just is not any graceful way out, the non-local character
of setjmp/longjmp cuts that knot instead of untying it with
horrible error code checking back up the stack.  This is why
I finally decided to go this way.  (Does this answer your main
question here?)
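
As an illustration of the pattern described above, here is a minimal sketch. It is not bootJVM's code: heap_get(), load_class(), run_guarded(), HEAP_LIMIT and FATAL_NO_MEMORY are all hypothetical names, and the artificial heap limit exists only so that the failure is easy to trigger.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

#define HEAP_LIMIT      4096u  /* hypothetical cap, only to force a failure */
#define FATAL_NO_MEMORY 1      /* hypothetical nonzero fatal-error code */

static jmp_buf known_good_state;   /* where fatal errors land */

/* Allocate or bail out. Callers never see a NULL pointer, so they need
 * no error checks of their own. */
static void *heap_get(size_t n)
{
    void *p = (n == 0 || n > HEAP_LIMIT) ? NULL : malloc(n);

    if (p == NULL)
        longjmp(known_good_state, FATAL_NO_MEMORY);  /* does not return */
    return p;
}

/* A normally (void) function in the middle of the call chain: it has no
 * return value through which an error could be passed up. */
static void load_class(size_t n)
{
    void *buffer = heap_get(n);

    free(buffer);
}

/* Top level: record the known good state, then do the work.
 * Returns 0 on success or the fatal-error code. */
int run_guarded(size_t n)
{
    switch (setjmp(known_good_state)) {
    case 0:                     /* direct return from setjmp() */
        load_class(n);
        return 0;
    case FATAL_NO_MEMORY:       /* arrived here through longjmp() */
        return FATAL_NO_MEMORY;
    default:
        return -1;
    }
}
```

Note that longjmp() skips every intermediate frame without running any cleanup, so anything those frames allocated is leaked. That is acceptable only because the error is fatal anyway.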

Also, I sort of get the impression that you may be blurring the
distinction between the native 'C' code runtime environment
and the virtual Java runtime environment when you talk
about serialization, security, GC, and JNI.  (This is _very_
easy to do!  This is why I begin my real-machine data types
with 'r' and Java data types with 'j'.  I was confusing myself
all the time!)  Obviously, there is no such thing as setjmp/longjmp
in the OO paradigm, but they do have a better method,
namely, the concept of the exception.  That is effectively
what I have tried to implement here in the native 'C' code
on the real platform, to use OO terms.  Did I misunderstand you?

Thanks,

Dan Lydick

-----Original Message-----
From: Rodrigo Kumpera <ku...@gmail.com>
Sent: Oct 19, 2005 11:54 AM
To: Apache Harmony Bootstrap JVM <bo...@earthlink.net>
Subject: Re: Some questions about the architecture

Dan,

Green threads are threads implemented by the JVM itself, as is done
right now by bootJVM. This model is very tricky when it comes to
implementing I/O primitives (you don't want all threads to block while
one I/O operation is waiting to complete). The only advantage is that
synchronization inside the JVM code is a non-issue.

Usually it's better to use one native thread for each started java thread.

I've been reading the code for the timeslice stuff. Why do you start
an extra thread if it doesn't do anything except receive the
alarm signal? Why not use the interpreter thread for that?

[]'s
Rodrigo

On 10/19/05, Apache Harmony Bootstrap JVM <bo...@earthlink.net> wrote:
>
> Rodrigo,
>
> I'm not familiar with the term "green threads", so could you
> explain?  Does it mean how I implemented the JVM time
> slice timer in 'jvm/src/timeslice.c' or something else?
> Let me digress a bit to make sure I have properly explained
> how I implemented JVM threads.
>
> Notice that I have simply implemented a pair of
> loops, almost _directly_ per the JVM spec, for the JVM threads.
> The outer loop is a while()/for() loop combination, found in
> jvm_run() in 'jvm/src/jvm.c', that monitors the thread table
> for no more user threads, via 'while(rtrue == no_user_threads)'
> and loops through each thread in the thread table via
> 'for(CURRENT_THREAD = ...)', and calls opcode_run()
> in 'jvm/src/opcode.c' through a state table macro expansion
> that resolves to the function threadstate_process_running()
> in 'jvm/src/threadstate.c'.  This opcode_run() function is
> where the virtual instructions are executed in the
> 'while (THREAD_STATE_RUNNING == ...)' loop.
>
> In this case, I don't use native threads for anything except
> the inner loop timeout condition.  How does this implementation
> fit into your question?
>
> Thanks,
>
> Dan Lydick
>
>
> -----Original Message-----
> From: Rodrigo Kumpera <ku...@gmail.com>
> Sent: Oct 19, 2005 9:39 AM
> To: Apache Harmony Bootstrap JVM <bo...@earthlink.net>
> Subject: Some questions about the architecture
>
> Hi Dan,
>
> I'm digging into the threading system and I found that bootJVM is using
> green threads. This performs pretty badly, as most platforms have decent
> threading libraries now, and supporting green threads will be a
> nightmare when it comes to implementing I/O primitives.
>
> Ignoring the synchronization requirements, using native threads is
> somewhat simpler, as the JVM doesn't need to care about context switches.
>
> Then, looking at how exceptions are thrown, I've got to say that using
> setjmp/longjmp is not the way to go. It's better to have proper stack
> walking code, as this is required by the runtime in many places
> (Serialization, Security, GC and JNI are some examples). Stack walking
> is a non-portable bitch; I know how it works on x86 hardware only.
>
> What I would suggest is to use native threads and the hardware stack
> for parameters, locals and stack stuff. It will be a lot easier to
> integrate with JIT'ed code and GC later.
>
> The gc will need some "gc_safepoint()" calls in method calls and
> backedges of the methods to allow threads to be stopped for
> stop-the-world collections.
>
> Besides that, I'm really looking forward to your work on Harmony.
>
> []'s
> Rodrigo
>
>

Re: Some questions about the architecture

Posted by Rodrigo Kumpera <ku...@gmail.com>.

On 10/19/05, Apache Harmony Bootstrap JVM <bo...@earthlink.net> wrote:
>
> Rodrigo,
>
> At some point, _somebody_ has to wait on I/O.  I agree
> that this is not the most efficient implementation, but one
> of the advantages it has is that it does not need _any_
> gc_safepoint() type calls for read or write barriers.
> I am _definitely_ interested in your suggestions, and
> I think others will agree with you, but let's get the code
> up and running as it stands so we can try other approaches
> and compare what good things they bring to the table
> instead of, or even in addition to, the existing approach.

I think I have not been clear enough. Safepoints are needed by the
garbage collector to know when it is safe to stop a given thread (in
bounded time) for a stop-the-world garbage collection. This has
nothing to do with read/write barriers.

For example, as I understand it, JikesRVM implements gc safepoints (the
points in the bytecode where gc maps are generated) at loop backedges
and method calls.

> The priorities that I set were (1) get the logic working
> without resorting to design changes such as multi-threading,
> then (2) optimize the implementation and evaluate
> improvements and architectural changes, then (3) implement
> improvements and architectural changes.  The same goes
> for the object model using the OBJECT() macro and the
> 'robject' structure in 'jvm/src/object.h'.  And the CLASS()
> macro, and the STACK() macro, and other components
> that I have tried to implement in a modular fashion (see 'README'
> for a discussion of this issue).  Let's get it working, then look into
> design changes, even having more than one option available at
> configuration time, compile time, or even run time, such as is
> now the case with the HEAP_xxx() macros and the GC_xxx()
> macros that Robin Garner has been asking about.
>
> As to the 'jvm/src/timeslice.c' code, notice that each
> time that SIGALRM is received, the handler sets a
> volatile boolean that is read by the JVM inner loop
> in 'while ( ... || (rfalse == pjvm->timeslice_expired))'
> in 'jvm/src/opcode.c' to check if it is time to give the
> next thread some time.  I don't expect this to be the
> most efficient check, but it _should_ work properly
> since I have unit tested the time slicing code, both
> the while() test and the setting of the boolean in
> timeslice_tick().  One thing I have heard on this
> list is that one of the implementations, I think it was
> IBM's Jikes (?), chose an interpreter over a JIT.
> Now that is not directly related to time slicing, but
> it does mean that a mechanism like the one I
> implemented does not need compile-time support.
>
> *** How about you JVM experts out there?  Do you have
>       any wisdom for me on the subject of time slicing
>       in an interpreter built on an outer/inner loop
>       pair?  And compared to JIT?  Archie Cobbs,
>       what do you think?  How about you lurkers out there? ***

All the open source JVMs I checked use native threads. You can take a look
at what IBM did with the Native POSIX Threading Library (NPTL), as it
implements userland threads on Linux.


> As to your question about setjmp/longjmp, I agree that
> there are other ways to do it.  In fact, I originally used
> stack walking, in one sense, to return from fatal errors
> in my original implementation of the heap
> allocator, which used malloc/free.  If I got an error
> from malloc(), I simply returned a NULL pointer, which
> I tested from the calling function.  If I got this error,
> I returned to its caller with an error, and so on, all the
> way up.  However, what happens when you have a
> normally (void) return?  Use TRUE/FALSE instead?
> Could be.  But the more I developed the code, the
> harder this became to support.  Therefore, since fatal
> errors kill the application anyway, I decided to _VASTLY_
> simplify the code by using what is effectively the OO concept
> of an exception as available in the 'C' runtime library
> with setjmp/longjmp.  Notice that many complicated models
> can end up with irresolvable terminal conditions and that
> the simplest way to escape is back to a known good state.
> This is the purpose of setjmp/longjmp.  Try this on for size
> with any communication protocol implementation, such as
> TCP/IP some time.  When you get to a snarled condition where
> there just is not any graceful way out, the non-local character
> of setjmp/longjmp cuts that knot instead of untying it with
> horrible error code checking back up the stack.  This is why
> I finally decided to go this way.  (Does this answer your main
> question here?)

It does, but by stack walking I did not mean returning NULL; I meant having
the code analyze the call stack for a proper IP address to use.

> Also, I sort of get the impression that you may be blurring the
> distinction between the native 'C' code runtime environment
> and the virtual Java runtime environment when you talk
> about serialization, security, GC, and JNI.  (This is _very_
> easy to do!  This is why I begin my real-machine data types
> with 'r' and Java data types with 'j'.  I was confusing myself
> all the time!)  Obviously, there is no such thing as setjmp/longjmp
> in the OO paradigm, but they do have a better method,
> namely, the concept of the exception.  That is effectively
> what I have tried to implement here in the native 'C' code
> on the real platform, to use OO terms.  Did I misunderstand you?
>

Not exactly. GC must walk the stack to find the root set;
Serialization needs to find the last user class loader on the
stack (since it's the one used to look up classes for deserialization);
Security needs to walk the stack to perform checks on the code
base of each method on it; and JNI needs this as exceptions are queued
for use by the ExceptionOccurred call.
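
To show what walking the stack for the root set means in the simplest case, here is a sketch over an interpreter-managed stack. Everything in it is hypothetical (struct frame, struct object, MAX_SLOTS, mark_roots()); it does not reflect bootJVM's STACK() macro or any real VM's frame layout, and it assumes the VM knows exactly which slots hold references.

```c
#include <stddef.h>

#define MAX_SLOTS 4     /* hypothetical number of reference slots per frame */

struct object {                         /* hypothetical heap object header */
    int marked;
};

struct frame {                          /* hypothetical interpreter frame */
    struct frame  *caller;              /* next older frame, NULL at bottom */
    struct object *slots[MAX_SLOTS];    /* reference locals, NULL if unused */
};

/* Walk from the newest frame to the oldest and mark every object that a
 * frame refers to. Returns the number of objects newly marked. */
size_t mark_roots(struct frame *top)
{
    size_t found = 0;
    struct frame *f;
    size_t i;

    for (f = top; f != NULL; f = f->caller) {
        for (i = 0; i < MAX_SLOTS; i++) {
            struct object *o = f->slots[i];

            if (o != NULL && !o->marked) {
                o->marked = 1;
                found++;
            }
        }
    }
    return found;
}
```

The same frame-by-frame loop is the basis for the other cases: instead of marking objects, the walker would inspect each frame's method or class loader.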

I did look at opcode.c and thread.c but I could not find the stack
unwinding code. Could you point me to where it is located?

> Thanks,
>
>
> Dan Lydick

Re: Some questions about the architecture

Posted by Archie Cobbs <ar...@dellroad.org>.

Apache Harmony Bootstrap JVM wrote:
> *** How about you JVM experts out there?  Do you have
>       any wisdom for me on the subject of time slicing
>       in an interpreter built on an outer/inner loop
>       pair?  And compared to JIT?  Archie Cobbs,
>       what do you think?  How about you lurkers out there? ***

Not sure if this qualifies as "wisdom" but here's how JCVM works:

- POSIX threads are used so no explicit timeslicing is needed.

- At every backward branch, executable code (or the interpreter)
   reads from a well-known memory location. When one thread wants to
   get the attention of all other threads real fast, it munmap()'s
   the memory page containing this address.
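
A rough single-threaded sketch of this page-trap polling idea follows. It is not JCVM's code: the function names are hypothetical, it assumes Linux (where reading an unmapped page raises SIGSEGV and si_addr holds the faulting address), and the handler simply maps the page back and sets a flag so the effect can be observed.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *poll_page;                 /* the "well-known memory location" */
static size_t poll_page_size;
static volatile sig_atomic_t attention_requested;

/* SIGSEGV handler: a fault inside the poll page means attention was
 * requested. Map the page back so the faulting read can be retried. */
static void on_segv(int signo, siginfo_t *info, void *context)
{
    uintptr_t fault = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)poll_page;

    (void)signo;
    (void)context;
    if (fault >= base && fault < base + poll_page_size &&
        mmap(poll_page, poll_page_size, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != MAP_FAILED) {
        attention_requested = 1;
    } else {
        signal(SIGSEGV, SIG_DFL);   /* a genuine crash: let it happen */
    }
}

/* Map the poll page and install the handler. Returns 0 on success. */
int poll_init(void)
{
    struct sigaction sa;
    long size = sysconf(_SC_PAGESIZE);

    if (size <= 0)
        return -1;
    poll_page_size = (size_t)size;
    poll_page = mmap(NULL, poll_page_size, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (poll_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Called by the thread that wants everyone's attention. */
int poll_request_attention(void)
{
    return munmap(poll_page, poll_page_size);
}

/* Executed at every backward branch: normally just one memory read.
 * Returns 1 if attention was requested since the last poll, else 0. */
int poll_at_backward_branch(void)
{
    char c = *(volatile char *)poll_page;   /* faults if page is unmapped */

    (void)c;
    if (attention_requested) {
        attention_requested = 0;
        return 1;
    }
    return 0;
}
```

In a real multi-threaded VM the handler would park the faulting thread rather than set a flag, and the fast path would be the read alone. The flag check here is only for demonstration.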

More details are in the JCVM manual:

   http://jcvm.sourceforge.net/share/jc/doc/jc.html

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com