You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by bu...@apache.org on 2014/09/11 07:51:38 UTC

[Bug 56953] New: A improvement for "DataInputStream"

https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

            Bug ID: 56953
           Summary: A improvement for "DataInputStream"
           Product: Tomcat 7
           Version: trunk
          Hardware: PC
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: hzhang9@ebay.com

Created attachment 31993
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31993&action=edit
patch

The method "readUnsignedShort()" of "DataInputStream" read the stream twice to
get the unsigned short. This happens even if the "BufferedInputStream" is
invoked.

    public final int readUnsignedShort() throws IOException {
        int ch1 = in.read();
        int ch2 = in.read();
        if ((ch1 | ch2) < 0)
            throw new EOFException();
        return (ch1 << 8) + (ch2 << 0);
    }

It may cause extra cost for some boundary processing. This case also appears in
"readInt", "readChar", etc. It is obvious in some large projects.

Use a interface to replace "DataInputStream" by "FastDataInputStream" can
bypass these processes. The "FastDataInputStream" gets the bytes from buffer
directly.

//========== method in FastDataInputStream. ==========
    public int readUnsignedShort() throws IOException{

        if(pos + 1 >= cnt) {
            fill();
            if(pos + 1 >= cnt) throw new EOFException();
        }

        int ch1 = this.buf[pos++] & 0xff;
        int ch2 = this.buf[pos++] & 0xff;
        return (ch1 << 8) + (ch2 << 0);
    }

Benefit shows bellow, it is got from the test case in attachment.
=====lots of jar files=====
DataInputStream: 592
FastDataInputStream: 488
=====few jar files=====
DataInputStream: 93
FastDataInputStream: 77

notice: The optimized method is called before original method in the test case,
so the real result should be more obvious.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #6 from Konstantin Kolinko <kn...@gmail.com> ---
The point of java.io.BufferedInputStream() is that is.read() were fast enough.
I do not see much benefit in re-implementing standard JRE classes.

By the way, is.read() is a blocking method. If you were to use fill() or
is.read(chars[]) there is no guarantee of how many bytes they can actually read
in a single call.

That is why DataInput.readFully() method is there.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #5 from Mark Thomas <ma...@apache.org> ---
If you continue to ignore the comments you are given then this issue is going
to get closed as WONTFIX.

Your first patch was 86k and full of irrelevant changes. The second attempt is
worse at 107k.

Starting at the beginning of the patch:
1. The first chunk changes restores an svn keyword the Tomcat team previously
removed. This change has nothing to do with this issue and should not be in
this patch.

2. The second chunk reverts a fix to a constant name the Tomcat team previously
fixed and removes some code necessary for Java 8 support. This change has
nothing to do with this issue and should not be in this patch. Further this
change breaks Java 8 support.

3. The third chunk makes further changes that have nothing to do with this
issue and further breaks Java 8 support.

And so on.

I have no problem with reviewing a patch that completely replaces
DataInputStream because of final methods. It is all the other irrelevant,
breaking changes in the patch that are the problem.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #11 from hzhang9@ebay.com ---
Created attachment 32030
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=32030&action=edit
FastDataInputStream implementation

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #16 from Mark Thomas <ma...@apache.org> ---
Patch has been applied to 7.0.x for 7.0.56 onwards.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #7 from Mark Thomas <ma...@apache.org> ---
With the various other changes and improvements back-ported to 7.0.x, I'm
prepared to consider this idea if the benefit justifies it. That said, we need
a patch that implements this idea - and just this idea - first.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

hzhang9@ebay.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #32008|0                           |1
        is obsolete|                            |

--- Comment #10 from hzhang9@ebay.com ---
Created attachment 32029
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=32029&action=edit
change DataInputStream to DataInput

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #12 from hzhang9@ebay.com ---
Created attachment 32031
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=32031&action=edit
Test case

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

hzhang9@ebay.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #31993|0                           |1
           is patch|                            |

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #15 from Konstantin Kolinko <kn...@gmail.com> ---
OK, this is better.

1. Formatting: the code shall not use tab characters

2. In "skipBytes(int n)":  there is no reason to call "fillNew()" after calling
"in.skip(n - sum)" on the underlying stream. If another skip call follows then
there is no point in filling the buffer.

3. "<< 0" shift operation is NOOP and can be removed.

4. I wonder whether "ch + ch" or "ch | ch" works better. In theory the latter
should be faster, but I guess there is no measurable difference nowadays.

5. In uninmplemented readLine() method: maybe better throw new
java.lang.UnsupportedOperationException() instead of IOException.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #8 from hzhang9@ebay.com ---
OK, and I'm working on this.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #18 from hzhang9@ebay.com ---
Thanks for your advice.
These points are extremely wise.

(In reply to Konstantin Kolinko from comment #15)
> OK, this is better.
> 
> 1. Formatting: the code shall not use tab characters
> 
> 2. In "skipBytes(int n)":  there is no reason to call "fillNew()" after
> calling "in.skip(n - sum)" on the underlying stream. If another skip call
> follows then there is no point in filling the buffer.
> 
> 3. "<< 0" shift operation is NOOP and can be removed.
> 
> 4. I wonder whether "ch + ch" or "ch | ch" works better. In theory the
> latter should be faster, but I guess there is no measurable difference
> nowadays.
> 
> 5. In uninmplemented readLine() method: maybe better throw new
> java.lang.UnsupportedOperationException() instead of IOException.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #1 from hzhang9@ebay.com ---
Created attachment 31994
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=31994&action=edit
test case

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #9 from Konstantin Kolinko <kn...@gmail.com> ---
A large portion of "your" code is apparently borrowed from some software that
uses GPL license. You cannot contribute such code to an Apache project.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #13 from hzhang9@ebay.com ---
Sorry for the delay.
New patch applies on TRUNK, and the codes borrowed from GPL has been changed.

Benefit shows bellow(300 jar files involved in this test):
=====Call FDIS first=====
DataInputStream: 7342
FastDataInputStream: 6967

=====reverse call sequency=====
DataInputStream: 7369
FastDataInputStream: 6979

The benefit is considerable when there are 200 or more jar files need to be
parse.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

hzhang9@ebay.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |56955

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #14 from Mark Thomas <ma...@apache.org> ---
Thanks for the updated patches. I see a 20-25% improvement with the patch so it
has been applied to 8.0.x for 8.0.13 onwards.

I'll look into porting it to 7.0.x.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |WONTFIX

--- Comment #21 from Mark Thomas <ma...@apache.org> ---
This fix has been reverted from trunk, 8.0.x (for 8.0.16 onwards) and 7.0.x
(for 7.0.58 onwards) and will not be reapplied.

Once the issues causing the regressions found so far are fixed, the performance
improvement is only a few percent. It simply isn't worth the risk of further
regressions to shave a few percent of the scanning time at application start.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #2 from Mark Thomas <ma...@apache.org> ---
The patch is no good. It includes a whole bunch of changes unrelated to this
issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #17 from Mark Thomas <ma...@apache.org> ---
Review comments applied.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #19 from Konstantin Kolinko <kn...@gmail.com> ---
Note that there was a regression caused by this change - bug 57173.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

hzhang9@ebay.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #31993|0                           |1
        is obsolete|                            |
  Attachment #31994|0                           |1
        is obsolete|                            |

--- Comment #3 from hzhang9@ebay.com ---
Created attachment 32008
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=32008&action=edit
patch

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #20 from hzhang9@ebay.com ---
(In reply to Konstantin Kolinko from comment #19)
> Note that there was a regression caused by this change - bug 57173.

I checked the case, and the problem does exist because I didn't implement
several method on it.
I'll try to fix it later.

The suggestion to roll back the change can be adopted temporarily.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 56953] A improvement for "DataInputStream"

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=56953

--- Comment #4 from hzhang9@ebay.com ---
(In reply to Mark Thomas from comment #2)
> The patch is no good. It includes a whole bunch of changes unrelated to this
> issue.
It is indeed a problem.
The change should be simpler than it shows in the patch, we just need to design
a new stream for the parse. 

But the methods in "DataInputStream" are always be "final", so I can only use a
interface to implement it.

Do you have any suggestion for this situation?

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org