Posted to dev@ignite.apache.org by Brett Elliott <be...@icr-team.com> on 2020/09/29 17:02:48 UTC

RE: [EXTERNAL] Re: cpp thin client vector resize

Hi Igor,

Your request to see my Read method made me examine it a little closer. The particular vector resize that's the problem is my own fault. Thanks for making me have a look at that again.

Thanks,
Brett

-----Original Message-----
From: Igor Sapego [mailto:isapego@apache.org] 
Sent: Tuesday, September 29, 2020 2:22 AM
To: dev <de...@ignite.apache.org>
Subject: [EXTERNAL] Re: cpp thin client vector resize

Hi,

Can you share your ignite::binary::BinaryType::Read method where reading of the std::vector is going on?

Also, are your strings large too or only vectors?

Best Regards,
Igor


On Mon, Sep 28, 2020 at 8:29 PM Brett Elliott <be...@icr-team.com> wrote:

> Hello,
>
> Tl;dr: I'm doing some profiling, and the cpp thin client is spending a 
> lot of time in memset. It's spending almost as much time as in the 
> socket recv call.
>
> Longer version:
>
> I was profiling a test harness that does a Get from our Ignite grid. 
> The test harness is written in C++ using the thin client. I profiled 
> the code using gperftools, and I found that memset (__memset_sse2 in 
> my case) was taking a large portion of the execution time. I'm 
> Get-ting a BinaryType which contains a std::string, an int32_t, and an 
> array of int8_t. For my test case, the array of int8_t values can 
> vary, but I got the best throughput on my machine using about 100MB for the size of that array.
>
> I profiled the test code doing a single Get, and doing 8 Gets. In the 
> 8 Get case, the number of memset calls increased, but the percentage 
> of overall time spent in memset was reduced. However, it was not 
> reduced as much as I'd hoped. I was hoping that the first Get call 
> would have a large memset, and the rest of the Get calls would skip 
> it, but that doesn't seem to be the case.
>
> I'm seeing almost as much time spent in memset as is being spent in 
> recv (__libc_recv in this case). That seems like a lot of time spent 
> initializing. I suspect that it's std::vector initialization caused by 
> resize.
>
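A minimal standalone sketch of the behaviour described above (an illustration, not Ignite code): for a trivially copyable element type, std::vector::resize value-initializes every element it adds, which compilers typically lower to a memset over the whole new range, whereas reserve only allocates.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    int main()
    {
        const std::size_t n = 100u * 1024u * 1024u;  // roughly the 100MB int8_t payload from the test

        std::vector<std::int8_t> zeroed;
        zeroed.resize(n);   // allocates AND zero-fills n bytes; this is what shows up as memset

        std::vector<std::int8_t> raw;
        raw.reserve(n);     // allocates only; size() stays 0 and nothing is written

        std::printf("zeroed.size()=%zu raw.size()=%zu raw.capacity()=%zu\n",
                    zeroed.size(), raw.size(), raw.capacity());
        return 0;
    }

The catch is that the bytes past size() in the reserved vector cannot portably be handed out through data(), so a reader that passes data() and a length to recv still has to resize first and pays for the zeroing.
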
> I believe memset is being invoked by a std::vector::resize operation 
> inside of ignite::binary::BinaryType::Read. I believe the source file 
> is 
> modules/platforms/cpp/binary/src/impl/binary/binary_reader_impl.cpp. 
> In the code I'm looking at it's line 905. There are only two calls to 
> resize in this source file, and it's the one in ReadTopObject0 which I 
> think is the culprit. I didn't compile with debug symbols to confirm 
> the particular resize call, but my profiler's callstack shows that resize is to blame for all the memset calls.
>
> Is there any way we can avoid std::vector::resize? I suspect that 
> ultimately the problem is that the buffer somewhere gets passed to a 
> socket recv call, and the recv call takes a pointer and a length. In that 
> case, there's no way that I know of to use a std::vector for the 
> buffer and avoid the unnecessary initialization/memset in the resize call.
>
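For reference, the pointer-plus-length pattern being described looks roughly like this (a hypothetical sketch with made-up names, not the actual thin-client code):

    #include <cstddef>
    #include <cstdint>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <vector>

    // Hypothetical sketch of the pattern described above; names are illustrative,
    // not taken from the Ignite client. Every byte of 'buffer' is written once by
    // resize() (zero fill) and then written again by recv().
    ssize_t ReadMessage(int sockFd, std::size_t messageLength, std::vector<std::int8_t>& buffer)
    {
        buffer.resize(messageLength);  // value-initializes: the whole buffer is zero-filled here
        return recv(sockFd, buffer.data(), buffer.size(), MSG_WAITALL);
    }

So the zeroing is pure overhead: any fix has to either give the vector uninitialized-but-usable storage or reuse storage that is already sized.
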
> Could another container be used instead of a vector?
> Could the vector be reused, so on subsequent calls we don't need to 
> resize it again?
> Could something like uvector (which skips initialization) be used instead?
>
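A sketch of how the last idea can look with a C++11 (or later) compiler; this shows the general technique only and is not taken from the Ignite code base. An allocator adaptor whose no-argument construct() default-initializes instead of value-initializing makes resize() stop zero-filling trivial element types:

    #include <cstdint>
    #include <memory>
    #include <new>
    #include <utility>
    #include <vector>

    // Allocator adaptor: the no-argument construct() default-initializes instead of
    // value-initializing, so vector::resize() on trivial types no longer emits a memset.
    template<typename T, typename Base = std::allocator<T> >
    struct default_init_allocator : Base
    {
        using BaseTraits = std::allocator_traits<Base>;

        template<typename U>
        struct rebind
        {
            using other =
                default_init_allocator<U, typename BaseTraits::template rebind_alloc<U> >;
        };

        using Base::Base;

        template<typename U>
        void construct(U* ptr)
        {
            ::new (static_cast<void*>(ptr)) U;  // default-init: no zeroing for trivial U
        }

        template<typename U, typename... Args>
        void construct(U* ptr, Args&&... args)
        {
            BaseTraits::construct(static_cast<Base&>(*this), ptr, std::forward<Args>(args)...);
        }
    };

    using RawBuffer = std::vector<std::int8_t, default_init_allocator<std::int8_t> >;

    int main()
    {
        RawBuffer buffer;
        buffer.resize(100u * 1024u * 1024u);  // allocates ~100MB without zero-filling it
        return 0;
    }

Reusing a single vector across Gets (the second idea) also avoids the memset, but only while the vector is never shrunk: with the default allocator, clear() followed by resize() value-initializes the whole range again.
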
> Thanks,
> Brett
>
>
>

Re: [EXTERNAL] Re: cpp thin client vector resize

Posted by Igor Sapego <is...@apache.org>.
No problem, feel free to write again if you encounter any issues.

Best Regards,
Igor


On Tue, Sep 29, 2020 at 8:02 PM Brett Elliott <be...@icr-team.com> wrote:

> Hi Igor,
>
> Your request to see my Read method made me examine it a little closer. The
> particular vector resize that's the problem is my own fault. Thanks for
> making me have a look at that again.
>
> Thanks,
> Brett
>
> -----Original Message-----
> From: Igor Sapego [mailto:isapego@apache.org]
> Sent: Tuesday, September 29, 2020 2:22 AM
> To: dev <de...@ignite.apache.org>
> Subject: [EXTERNAL] Re: cpp thin client vector resize
>
> Hi,
>
> Can you share your ignite::binary::BinaryType::Read method where reading
> of the std::vector is going on?
>
> Also, are your strings large too or only vectors?
>
> Best Regards,
> Igor
>
>
> On Mon, Sep 28, 2020 at 8:29 PM Brett Elliott <be...@icr-team.com>
> wrote:
>
> > Hello,
> >
> > Tl;dr: I'm doing some profiling, and the cpp thin client is spending a
> > lot of time in memset. It's spending almost as much time as in the
> > socket recv call.
> >
> > Longer version:
> >
> > I was profiling a test harness that does a Get from our Ignite grid.
> > The test harness is written in C++ using the thin client. I profiled
> > the code using gperftools, and I found that memset (__memset_sse2 in
> > my case) was taking a large portion of the execution time. I'm
> > Get-ting a BinaryType which contains a std::string, an int32_t, and an
> > array of int8_t. For my test case, the array of int8_t values can
> > vary, but I got the best throughput on my machine using about 100MB
> > for the size of that array.
> >
> > I profiled the test code doing a single Get, and doing 8 Gets. In the
> > 8 Get case, the number of memset calls increased, but the percentage
> > of overall time spent in memset was reduced. However, it was not
> > reduced as much as I'd hoped. I was hoping that the first Get call
> > would have a large memset, and the rest of the Get calls would skip
> > it, but that doesn't seem to be the case.
> >
> > I'm seeing almost as much time spent in memset as is being spent in
> > recv (__libc_recv in this case). That seems like a lot of time spent
> > initializing. I suspect that it's std::vector initialization caused by
> > resize.
> >
> > I believe memset is being invoked by a std::vector::resize operation
> > inside of ignite::binary::BinaryType::Read. I believe the source file
> > is
> > modules/platforms/cpp/binary/src/impl/binary/binary_reader_impl.cpp.
> > In the code I'm looking at it's line 905. There are only two calls to
> > resize in this source file, and it's the one in ReadTopObject0 which I
> > think is the culprit. I didn't compile with debug symbols to confirm
> > the particular resize call, but my profiler's callstack shows that
> > resize is to blame for all the memset calls.
> >
> > Is there any way we can avoid std::vector::resize? I suspect that
> > ultimately the problem is that the buffer somewhere gets passed to a
> > socket recv call, and the recv call takes a pointer and a length. In that
> > case, there's no way that I know of to use a std::vector for the
> > buffer and avoid the unnecessary initialization/memset in the resize
> > call.
> >
> > Could another container be used instead of a vector?
> > Could the vector be reused, so on subsequent calls we don't need to
> > resize it again?
> > Could something like uvector (which skips initialization) be used
> > instead?
> >
> > Thanks,
> > Brett
> >
> >
> >
>