Posted to dev@cxf.apache.org by Alessio Soldano <as...@redhat.com> on 2015/02/03 14:11:56 UTC

Re: JAX-WS client performance

A brief update: I've committed the workaround for the HTTP connection
issue (thanks Dan for the help on that!) as well as some other
straightforward optimizations for things that popped up while profiling.

Now, the next "big" topic I found is the way we get and set properties
in the message context. We spend a significant amount of time creating
HashMap instances, and especially in MessageImpl#calcContextCache, which
copies (putAll) all the Bus, Service, Endpoint, etc. properties into the
context cache. You can see [1] the CPU hotspot view I currently get with
the previously mentioned test app. AFAICS from the source history, there
used to be a different way of dealing with message properties in the
past [2]; then the cache mechanism was added. So I'm wondering whether
some kind of profiling / perf testing was performed in the past and led
to these changes. I might simply be testing an edge scenario, with very
few properties being looked up and hence not justifying the caching
mechanism.
Any comment / idea / suggestion?
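
To make the trade-off concrete, here is a minimal illustrative sketch
(simplified and with made-up names, not the actual MessageImpl code) of
an eagerly built context cache: every scope map gets copied up front so
that later gets are single-map lookups, which is exactly the putAll cost
showing up in the profile when only a handful of properties are ever
read.

    // Illustrative sketch only -- simplified, not copied from
    // org.apache.cxf.message.MessageImpl#calcContextCache.
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class EagerContextCacheSketch {
        // Scope maps ordered message, exchange, endpoint, service, bus, ...
        private final List<Map<String, Object>> scopes;
        private Map<String, Object> contextCache;

        EagerContextCacheSketch(List<Map<String, Object>> scopes) {
            this.scopes = scopes;
        }

        Object getContextualProperty(String key) {
            if (contextCache == null) {
                // The expensive part seen in the profile: allocate a map and
                // copy every property from every scope, even if only a few
                // keys will ever be looked up on this message.
                contextCache = new HashMap<>();
                for (int i = scopes.size() - 1; i >= 0; i--) {
                    contextCache.putAll(scopes.get(i)); // narrower scopes win
                }
            }
            // After that, every lookup is a single-map get.
            return contextCache.get(key);
        }
    }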

Cheers
Alessio

[1] http://pasteboard.co/QgiD4Af.png
[2]

On 27/01/15 18:14, Alessio Soldano wrote:
> Hi,
> my attention has recently been brought to a scenario in which an
> Apache CXF client invokes an endpoint operation in a loop and the
> number of invocations performed in a given amount of time (say, 2
> minutes) is used as a benchmark for measuring WS stack performance.
> It's actually a very simplistic scenario, with a plain JAX-WS
> single-threaded client sending and receiving small RPC/Lit SOAP
> messages [1]. The reason I've been asked to have a look is that, with
> default settings, the Apache CXF JAX-WS impl seems to perform
> *shamefully* badly compared to the Metro (JAX-WS RI) implementation.
> I initially blamed the user's log configuration, etc., but when I
> eventually tried it myself I could actually reproduce the bad results.
> I've been profiling a bit and found a few hot spot areas where CXF
> could possibly be optimized, but the big issue really seems to be at
> the HTTPConduit / HttpURLConnection level.
> I found that almost all the invocations end up in
> sun.net.www.http.HttpClient.New(..) calling the available() method [2]
> as part of the process for re-using cached connections [3]; that goes
> to the wire to attempt a read and takes a lot of time.
> When the RI performs the equivalent operation, the available() method
> is not called [4], resulting in much better performance.
> Looking at the JDK code, it seems to me that the problem boils down to
> sun.net.www.protocol.http.HttpURLConnection#streaming() [5] returning
> different values, as a consequence of the fixedContentLength attribute
> being set to a value other than -1 only when running on CXF. As a
> matter of fact, that is set when
> HTTPConduit.WrappedOutputStream#thresholdNotReached() is called, i.e.
> whenever a message is completely written to the output stream buffer
> before the chunking threshold is reached (at least AFAIU). By contrast,
> I've searched through the JAX-WS RI and could not find any place where
> setFixedLengthStreamingMode is called on the connection.
> So, I've performed two quick-and-dirty experiments: the first time I
> forced allowChunking = false on the client policy, the second time I
> commented out the code in
> HTTPConduit.WrappedOutputStream#thresholdNotReached(). In both cases I
> managed to get performance comparable to what I get with the JAX-WS RI.
> Now, a few questions:
> - are we really required to call setFixedLengthStreamingMode as we
> currently do? What's the drawback of not calling it?
> - should we actually do something to get decent performance by default
> in this scenario? (I'm not sure expecting the user to disable chunking
> is really an option...)
> As a side note, the relevant part of the JDK HttpClient code changed
> between JDK 6 and JDK 7, so things have not always been as explained
> above...
>
> Cheers
> Alessio
>
>
> [1] http://www.fpaste.org/176166/14223765/
> [2] http://pasteboard.co/FR5QVrP.png
> [3] 
> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/http/HttpClient.java#276
> [4] http://pasteboard.co/FR8okYM.png
> [5] 
> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/protocol/http/HttpURLConnection.java#HttpURLConnection.streaming%28%29
>
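
On the JDK side, the distinction the above analysis hinges on is whether
the request body is sent with a known fixed length or chunked. A hedged
sketch of the two streaming modes on a raw HttpURLConnection (the URL
and payload below are placeholders, not taken from the benchmark app):

    // Hedged sketch of the two HttpURLConnection streaming modes discussed
    // above; the URL and payload are placeholders.
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    final class StreamingModes {
        static void post(byte[] payload, boolean chunked) throws Exception {
            HttpURLConnection conn =
                (HttpURLConnection) new URL("http://localhost:8080/test").openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            if (chunked) {
                conn.setChunkedStreamingMode(4096);               // Transfer-Encoding: chunked
            } else {
                conn.setFixedLengthStreamingMode(payload.length); // Content-Length known up front
            }
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            conn.getResponseCode(); // response body handling omitted in this sketch
        }
    }

And for anyone wanting to reproduce the "allowChunking = false"
experiment mentioned above, the client policy can be tweaked roughly
like this (again a sketch only; the jaxwsProxy argument stands in for
whatever port object the benchmark app uses, and this is not the
committed workaround):

    // Hedged sketch of forcing chunking off on a JAX-WS proxy, as in the
    // "allowChunking = false" experiment described above.
    import org.apache.cxf.endpoint.Client;
    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    final class ChunkingConfig {
        static void disableChunking(Object jaxwsProxy) {
            Client client = ClientProxy.getClient(jaxwsProxy);
            HTTPConduit conduit = (HTTPConduit) client.getConduit();
            HTTPClientPolicy policy = new HTTPClientPolicy();
            policy.setAllowChunking(false); // skip the chunking/threshold path discussed above
            conduit.setClient(policy);
        }
    }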


-- 
Alessio Soldano
Web Service Lead, JBoss


Re: JAX-WS client performance

Posted by Alessio Soldano <as...@redhat.com>.
Hi Dan,
thanks for the feedback.

On 05/02/15 23:01, Daniel Kulp wrote:
> 1) Can you save/print a list of everything that is a “get” from the non-message contexts?   I’m mostly concerned about Bus/BindingOperation/Endpoint objects.   A long long time ago (2.0/2.1 timeframe), those objects were stored as contextual properties and the only way to retrieve them was via the “getContextualProperty” things.   Since we use those all over the place, I added direct accessors to the Exchange, but for compatibility still stored them as properties.   However, I don’t remember if I went and found all the places that used to look those up via the “get” methods and replaced them.   We probably could remove those from being stored as contextual properties and document that. (should have been done long ago)  Would reduce the number of properties to deal with.
I've added a few simple System.out calls in the get method of the
"array" version and run the jbossws testsuite. No gets of
Bus/BindingOperation/Endpoint objects AFAICS; so I tried grepping for
getContextualProperty in the whole CXF sources, and the only thing I
could find is
org/apache/cxf/rs/security/saml/sso/AbstractRequestAssertionConsumerHandler
getting the Bus from the context.
I'm sending you the zip with the logs.
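
For context, the "direct accessors" Dan mentions above are the typed
getters on the Exchange. A minimal sketch of the difference (the
property key used in the legacy-style lookup is an assumption here,
shown purely for illustration):

    // Hedged sketch: typed Exchange accessor vs. a contextual-property lookup.
    // The string key below is an assumed example, not necessarily the key CXF
    // actually stores the Bus under.
    import org.apache.cxf.Bus;
    import org.apache.cxf.message.Message;

    final class BusLookupSketch {
        static Bus direct(Message message) {
            return message.getExchange().getBus();        // direct accessor on the Exchange
        }

        static Bus viaContextualProperty(Message message) {
            return (Bus) message.getContextualProperty(Bus.class.getName()); // legacy-style lookup
        }
    }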


> 2) In the “array” version, you could loop through once and just do an == compare and then only do the .equals if that ends up null.   I’m willing to bet most of the strings we use as keys are interned.
Tried that, but it's not improving performance. To be honest, though,
my bench app is likely not stressing the 'get' path that much, as it
doesn't use WS-Security, etc.
Anyway, my gut feeling is that's also because the cache misses far
outnumber the cache hits (for instance, on the client side in the
jbossws testsuite I see ~5700 props found in the message context, ~350
props found in non-message contexts and ~72000 props not found).

Cheers
Alessio

-- 
Alessio Soldano
Web Service Lead, JBoss


Re: JAX-WS client performance

Posted by Daniel Kulp <dk...@apache.org>.
Alessio,

Couple “thoughts”:

1) Can you save/print a list of everything that is a “get” from the non-message contexts?   I’m mostly concerned about Bus/BindingOperation/Endpoint objects.   A long long time ago (2.0/2.1 timeframe), those objects were stored as contextual properties and the only way to retrieve them was via the “getContextualProperty” things.   Since we use those all over the place, I added direct accessors to the Exchange, but for compatibility still stored them as properties.   However, I don’t remember if I went and found all the places that used to look those up via the “get” methods and replaced them.   We probably could remove those from being stored as contextual properties and document that. (should have been done long ago)  Would reduce the number of properties to deal with.

2) In the “array” version, you could loop through once and just do an == compare and then only do the .equals if that ends up null.   I’m willing to bet most of the strings we use as keys are interned.
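
A minimal sketch of what suggestion 2) could look like against a flat
key/value array (purely illustrative; not taken from the patches
attached to CXF-6227):

    // Hedged sketch of the "== first, .equals only on a miss" scan over a
    // flat key/value Object[]; illustrative only.
    final class ArrayPropsSketch {
        private Object[] kv = new Object[0]; // even slots hold keys, odd slots hold values

        Object get(String key) {
            // First pass: identity compares only -- cheap, and enough when the
            // key strings are interned.
            for (int i = 0; i < kv.length; i += 2) {
                if (kv[i] == key) {
                    return kv[i + 1];
                }
            }
            // Second pass: full equals, only when the identity scan found nothing.
            // Note that a genuine miss pays for both passes.
            for (int i = 0; i < kv.length; i += 2) {
                if (key.equals(kv[i])) {
                    return kv[i + 1];
                }
            }
            return null;
        }
    }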


Dan



-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: JAX-WS client performance

Posted by Alessio Soldano <as...@redhat.com>.
I've posted two comments, each with a patch file, on
https://issues.apache.org/jira/browse/CXF-6227 .
If you have any feedback, please let me know :-)

Cheers
Alessio

On 04/02/15 11:13, Alessio Soldano wrote:
> On 03/02/15 23:52, Alessio Soldano wrote:
>> so we'd have a get method computing the requested key's hash once and 
>> not for each sub-map, etc. I don't really know yet if this is doable,
> it isn't... as each map has its own hash seed. Will go on thinking 
> about alternatives though.
>
> Alessio


-- 
Alessio Soldano
Web Service Lead, JBoss


Re: JAX-WS client performance

Posted by Alessio Soldano <as...@redhat.com>.
On 03/02/15 23:52, Alessio Soldano wrote:
> so we'd have a get method computing the requested key's hash once and 
> not for each sub-map, etc. I don't really know yet if this is doable,
it isn't... as each map has its own hash seed. Will go on thinking about 
alternatives though.

Alessio

Re: JAX-WS client performance

Posted by Alessio Soldano <as...@redhat.com>.
Hi Dan,

On 03/02/15 21:02, Daniel Kulp wrote:
>> On Feb 3, 2015, at 8:11 AM, Alessio Soldano <as...@redhat.com> wrote:
>>
>> A brief update: I've committed the workaround for the HTTP connection issue (thanks Dan for the help on that!) as well as some other straightforward optimizations for things that popped up while profiling.
>>
>> Now, the next "big" topic I found is the way we get and set properties in the message context. We spend a significant amount of time creating HashMap instances, and especially in MessageImpl#calcContextCache, which copies (putAll) all the Bus, Service, Endpoint, etc. properties into the context cache. You can see [1] the CPU hotspot view I currently get with the previously mentioned test app. AFAICS from the source history, there used to be a different way of dealing with message properties in the past [2]; then the cache mechanism was added. So I'm wondering whether some kind of profiling / perf testing was performed in the past and led to these changes. I might simply be testing an edge scenario, with very few properties being looked up and hence not justifying the caching mechanism.
>> Any comment / idea / suggestion?
> At one point, every “get” of a property would end up checking 4 or 5 hash maps which resulted in the keys being hashCoded many times, lots of checks, etc…    When you get into the WS-Security cases and some of the HTTP configuration cases where there are a bunch of keys being looked up, there was a LOT of time being spent on the lookups.
OK, I see, thanks; perhaps calls to "get" became a problem when you had
a wide range of properties that could have been in the context but
actually were not, so you went through all the maps just to eventually
figure out the prop was not there... I wonder whether a different cache
approach (saving the cache hit and cache miss entries in two maps as
they're requested) could make sense.
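
A minimal sketch of that hit/miss idea (purely illustrative; it glosses
over invalidating the cache when the underlying maps change, which real
code would have to handle):

    // Illustrative sketch of lazily caching hits and misses per requested key;
    // the scope list is a simplification of the Message/Exchange/Endpoint/etc. maps.
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    class LazyContextCacheSketch {
        private final List<Map<String, Object>> scopes;
        private final Map<String, Object> hits = new HashMap<>();
        private final Set<String> misses = new HashSet<>();

        LazyContextCacheSketch(List<Map<String, Object>> scopes) {
            this.scopes = scopes;
        }

        Object getContextualProperty(String key) {
            Object cached = hits.get(key);
            if (cached != null) {
                return cached;
            }
            if (misses.contains(key)) {
                return null;                // known miss: don't walk the scopes again
            }
            for (Map<String, Object> scope : scopes) {
                Object value = scope.get(key);
                if (value != null) {
                    hits.put(key, value);   // remember the hit
                    return value;
                }
            }
            misses.add(key);                // remember the miss as well
            return null;
        }
    }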

>     For the most part, at the time, the maps were relatively small and the cost to build a single “context” map was small in comparison which is why this was done.
>
> That said, the size of the cache map is likely fairly small as well.   Maybe a dozen keys?  (and I'm willing to bet most of the keys are interned where a == would catch it)  Might be simpler to just use an Object[] or something.
I've tried printing the cache size while running the jbossws testsuite
and on average I get values around 40~50 (with most of the entries being
in the exchange and message maps). So I'm not sure a plain array would
be a better solution in terms of performance.
What I've been thinking about (but haven't actually tried, at least not
yet) is a custom HashMap extension that would work as a composition of
multiple HashMap instances; so we'd have a get method computing the
requested key's hash once and not for each sub-map, etc. I don't really
know yet if this is doable; perhaps I can try providing something
working that avoids all the map copies (btw, I bet we have basically no
key overlaps among the various maps for the same context).

Thanks for now
Alessio

-- 
Alessio Soldano
Web Service Lead, JBoss


Re: JAX-WS client performance

Posted by Daniel Kulp <dk...@apache.org>.
> On Feb 3, 2015, at 8:11 AM, Alessio Soldano <as...@redhat.com> wrote:
> 
> A brief update: I've committed the workaround for the HTTP connection issue (thanks Dan for the help on that!) as well as some other straightforward optimizations for things that popped up while profiling.
> 
> Now, the next "big" topic I found is the way we get and set properties in the message context. We spend a significant amount of time creating HashMap instances, and especially in MessageImpl#calcContextCache, which copies (putAll) all the Bus, Service, Endpoint, etc. properties into the context cache. You can see [1] the CPU hotspot view I currently get with the previously mentioned test app. AFAICS from the source history, there used to be a different way of dealing with message properties in the past [2]; then the cache mechanism was added. So I'm wondering whether some kind of profiling / perf testing was performed in the past and led to these changes. I might simply be testing an edge scenario, with very few properties being looked up and hence not justifying the caching mechanism.
> Any comment / idea / suggestion?

At one point, every “get” of a property would end up checking 4 or 5 hash maps which resulted in the keys being hashCoded many times, lots of checks, etc…    When you get into the WS-Security cases and some of the HTTP configuration cases where there are a bunch of keys being looked up, there was a LOT of time being spent on the lookups.   For the most part, at the time, the maps were relatively small and the cost to build a single “context” map was small in comparison which is why this was done.   

That said, the size of the cache map is likely fairly small as well.   Maybe a dozen keys?  (and I'm willing to bet most of the keys are interned where a == would catch it)  Might be simpler to just use an Object[] or something.

Dan




-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com