You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@harmony.apache.org by Egor Pasko <eg...@gmail.com> on 2009/03/10 09:25:06 UTC

Re: Fwd: [jira] Created: (HARMONY-6007) [drlvm][jit][abcd] classic abcd pass fixes

On the 0x511 day of Apache Harmony Naveen Neelakantam wrote:
>> on first implementation we checked BidirectionalBubbleSort with the
>> current ABCD. It removed all checks. IMHO, this is a good argument to
>> prove that the issue is fixable without introducing principially new
>> concepts.
>>    
>
> I disagree with your assessment.  I believe that although ABCD did
> work once, it is actually quite fragile.  My fix removes a major
> source of this fragility.
>
> And as for introducing new concepts, I believe that has already
> happened when you decided to introduce an e-SSA that has enhanced pi
> nodes and that is used for both the upper bound and lower bound
> problems simultaneously.  I think those were good changes based on
> sound understanding.  I believe my change also falls into this
> category.  Please take the time to understand what I am proposing.

Naveen,

sorry for the very late reply. It is not only me very busy, but also
there had to be a lot of experimentation to find out what is
happening.

The short story is: "Your patch in HARMONY-6007 is great, awesome,
it's operation is absolutely correct, it reveals a bunch of serious
existing problems. I am very thankful for these. Fixing these
problems covers all issues and adds extra benefits. So, I am going to
stay with the new version."

Now the full story, take a good seat, have a nice drink and leesen
carefully :) I would be really glad to see comments, objections,
advice, counterexamples, etc.

Preface.

I attached my patch to HARMONY-6007, please, take a look (the solver
contains a lot of trailing space removal, sorry for that).

Chapter 1. Passes.

Optimization passed preceding to "classic_abcd" are:
"memopt,dce,uce,simplify,dce,uce". I removed "hvn" since I did not
find cases, where it becomes useful in presence of
"memopt". "simplify" is required immediately before "classic_abcd"
(modulo "dce,uce") because the latter relies on constants being
folded, you can see it by running DaCapo pmd benchmark in debug mode
using server_static pass.

Chapter 2. Revealed problems.

Naveen, your patch does a great job at termination stage at finding
arraylength operands even if they are unconnected
(i.e. unconstrained). Array length operand being unconstrained is a
bug in "classic_abcd", so I jumped in at fixing this. The problem
appeared in this pseudocoded situation: "call A() -> int x;
newarray(size=x);". The fix contains two parts: (1) handling more load
operations that can produce arraylength operands (Op_LdStatic,
Op_TauLdInd, Op_DirectCall, Op_TauVirtualCall, etc.) and (2) not
renaming array length operands in chkbounds instructions so that there
is a single instance of this arraylen to find in the solver.

Meanwhile I spotted another problem with our approach. Consider the code:
int arr[] = new int[limit];
if (i < limit) {
    if (0 <= i) {
        arr[i] = 5;
    }
}

both original version and your patched version failed to eliminate the
check here. Just because pi renaming logic was incorrect (in fact I
never looked into it before and relied on many assumptions about what
in fact it does not do). And there was a bug that relied on the fact
that Pi instructions always start the block. This is no longer the
case when we insert tauedge() (this was probably why it broke, not
sure). Weird, we have a lot of such crap in JIT. So, I fixed, cleaned
that up a little. I was lazy to verify that it works right with
nontrivial code structures. Small examples like the above worked. If
you spot something weird with a relatively small test, I would very
much appreciate and will fix that faster than this time.

Chapter 3. Results.

I ran DaCapo with both patches in server_static mode with debug build
(all I could, not everything works, dammit). Original patch in
HARMONY-6007 ate 9.5% checks, while the new patch ate 9.9555%. Among
the whole run I found only 2 methods, where original patch
outperformed the new one, but again I was lazy to spot the problem
since the methods were quite big.

Almost forgot, regression tests passed with both patches. I should
commit them into the codebase as real regression tests. TODO.

Chapter 4. Problems remaining.

My solution is still "more fragile" in terms of being resistant to
adding new opcodes. Whoever adds new opcodes should be careful about
setting the constrained/unconstrained status on them in the inequality
graph. I'd like to stick with this solution though, as I said, to
minimize the amount of levels of abstraction.

Another problem: memopt sucks .. at finding aliases of arraylen loads
if we are working with fields and not with arrays on stack.

Example:
public class field {
    public Object ss = new Object();
    public Object[] objs = null;
    private Object[] run(int l) {
        objs = new Object[l];
        for (int i = l-1; i >= 0; i--) {
            objs[i] = ss;
        }
        return objs;
    }
    public static void main(String[] args) {
        field t = new field();
        System.out.println(t.run(10)[0]);
    }
}

this is probably not killing performance in real cases because there
is usually some external call in the loop that can potentially rewrite
the field (using reflection, argh).

If anyone finds more problems with bounds checks removal, please,
report with tests. To find out how many bounds checks was
detected/eliminated for each method, this DRLVM option is useful:
-XX:jit.arg.dump_abcd_stats=true, this dumps the info in the
./bounds_checks.log (Though do not forget that other optimizations can
also eliminate bounds checks as hvn,dabce,fastArrayFill).

Chapter 5. Conclusions.

Naveen, thank you thank you thank you for the great work! Without it I
would have been clueless about important problem sources for a much
longer while. Great job!

-- 
Egor Pasko


Re: Fwd: [jira] Created: (HARMONY-6007) [drlvm][jit][abcd] classic abcd pass fixes

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x572 day of Apache Harmony Naveen Neelakantam wrote:
> Hi Egor,
>
> I had started to worry that this topic would be forgotten.  Obviously it
> hasn't been because you have been very busy!
>
> I will take a detailed look at your new patch this weekend and get back
> to you.  That way I can be sure to have a drink handy, per your
> instructions.:-)

JFYI: Committed revision 764315. Could not wait :)

-- 
Egor Pasko


Re: Fwd: [jira] Created: (HARMONY-6007) [drlvm][jit][abcd] classic abcd pass fixes

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x572 day of Apache Harmony Naveen Neelakantam wrote:
> Hi Egor,
>
> I had started to worry that this topic would be forgotten.  Obviously it
> hasn't been because you have been very busy!
>
> I will take a detailed look at your new patch this weekend and get back
> to you.  That way I can be sure to have a drink handy, per your
> instructions.:-)

Naveen, thanks! nice to 'hear' from you.

I just could not sleep well, so, to forget this I had to do something
constructive :)

-- 
Egor Pasko


Re: Fwd: [jira] Created: (HARMONY-6007) [drlvm][jit][abcd] classic abcd pass fixes

Posted by Naveen Neelakantam <ne...@illinois.edu>.
Hi Egor,

I had started to worry that this topic would be forgotten.  Obviously it
hasn't been because you have been very busy!

I will take a detailed look at your new patch this weekend and get back
to you.  That way I can be sure to have a drink handy, per your
instructions.:-)

Naveen

On 3/10/2009 3:25 AM, Egor Pasko wrote:
> On the 0x511 day of Apache Harmony Naveen Neelakantam wrote:
>    
>>> on first implementation we checked BidirectionalBubbleSort with the
>>> current ABCD. It removed all checks. IMHO, this is a good argument to
>>> prove that the issue is fixable without introducing principially new
>>> concepts.
>>>
>>>        
>> I disagree with your assessment.  I believe that although ABCD did
>> work once, it is actually quite fragile.  My fix removes a major
>> source of this fragility.
>>
>> And as for introducing new concepts, I believe that has already
>> happened when you decided to introduce an e-SSA that has enhanced pi
>> nodes and that is used for both the upper bound and lower bound
>> problems simultaneously.  I think those were good changes based on
>> sound understanding.  I believe my change also falls into this
>> category.  Please take the time to understand what I am proposing.
>>      
>
> Naveen,
>
> sorry for the very late reply. It is not only me very busy, but also
> there had to be a lot of experimentation to find out what is
> happening.
>
> The short story is: "Your patch in HARMONY-6007 is great, awesome,
> it's operation is absolutely correct, it reveals a bunch of serious
> existing problems. I am very thankful for these. Fixing these
> problems covers all issues and adds extra benefits. So, I am going to
> stay with the new version."
>
> Now the full story, take a good seat, have a nice drink and leesen
> carefully :) I would be really glad to see comments, objections,
> advice, counterexamples, etc.
>
> Preface.
>
> I attached my patch to HARMONY-6007, please, take a look (the solver
> contains a lot of trailing space removal, sorry for that).
>
> Chapter 1. Passes.
>
> Optimization passed preceding to "classic_abcd" are:
> "memopt,dce,uce,simplify,dce,uce". I removed "hvn" since I did not
> find cases, where it becomes useful in presence of
> "memopt". "simplify" is required immediately before "classic_abcd"
> (modulo "dce,uce") because the latter relies on constants being
> folded, you can see it by running DaCapo pmd benchmark in debug mode
> using server_static pass.
>
> Chapter 2. Revealed problems.
>
> Naveen, your patch does a great job at termination stage at finding
> arraylength operands even if they are unconnected
> (i.e. unconstrained). Array length operand being unconstrained is a
> bug in "classic_abcd", so I jumped in at fixing this. The problem
> appeared in this pseudocoded situation: "call A() ->  int x;
> newarray(size=x);". The fix contains two parts: (1) handling more load
> operations that can produce arraylength operands (Op_LdStatic,
> Op_TauLdInd, Op_DirectCall, Op_TauVirtualCall, etc.) and (2) not
> renaming array length operands in chkbounds instructions so that there
> is a single instance of this arraylen to find in the solver.
>
> Meanwhile I spotted another problem with our approach. Consider the code:
> int arr[] = new int[limit];
> if (i<  limit) {
>      if (0<= i) {
>          arr[i] = 5;
>      }
> }
>
> both original version and your patched version failed to eliminate the
> check here. Just because pi renaming logic was incorrect (in fact I
> never looked into it before and relied on many assumptions about what
> in fact it does not do). And there was a bug that relied on the fact
> that Pi instructions always start the block. This is no longer the
> case when we insert tauedge() (this was probably why it broke, not
> sure). Weird, we have a lot of such crap in JIT. So, I fixed, cleaned
> that up a little. I was lazy to verify that it works right with
> nontrivial code structures. Small examples like the above worked. If
> you spot something weird with a relatively small test, I would very
> much appreciate and will fix that faster than this time.
>
> Chapter 3. Results.
>
> I ran DaCapo with both patches in server_static mode with debug build
> (all I could, not everything works, dammit). Original patch in
> HARMONY-6007 ate 9.5% checks, while the new patch ate 9.9555%. Among
> the whole run I found only 2 methods, where original patch
> outperformed the new one, but again I was lazy to spot the problem
> since the methods were quite big.
>
> Almost forgot, regression tests passed with both patches. I should
> commit them into the codebase as real regression tests. TODO.
>
> Chapter 4. Problems remaining.
>
> My solution is still "more fragile" in terms of being resistant to
> adding new opcodes. Whoever adds new opcodes should be careful about
> setting the constrained/unconstrained status on them in the inequality
> graph. I'd like to stick with this solution though, as I said, to
> minimize the amount of levels of abstraction.
>
> Another problem: memopt sucks .. at finding aliases of arraylen loads
> if we are working with fields and not with arrays on stack.
>
> Example:
> public class field {
>      public Object ss = new Object();
>      public Object[] objs = null;
>      private Object[] run(int l) {
>          objs = new Object[l];
>          for (int i = l-1; i>= 0; i--) {
>              objs[i] = ss;
>          }
>          return objs;
>      }
>      public static void main(String[] args) {
>          field t = new field();
>          System.out.println(t.run(10)[0]);
>      }
> }
>
> this is probably not killing performance in real cases because there
> is usually some external call in the loop that can potentially rewrite
> the field (using reflection, argh).
>
> If anyone finds more problems with bounds checks removal, please,
> report with tests. To find out how many bounds checks was
> detected/eliminated for each method, this DRLVM option is useful:
> -XX:jit.arg.dump_abcd_stats=true, this dumps the info in the
> ./bounds_checks.log (Though do not forget that other optimizations can
> also eliminate bounds checks as hvn,dabce,fastArrayFill).
>
> Chapter 5. Conclusions.
>
> Naveen, thank you thank you thank you for the great work! Without it I
> would have been clueless about important problem sources for a much
> longer while. Great job!
>
>