Posted to dev@ignite.apache.org by Вадим Опольский <va...@gmail.com> on 2017/03/09 08:20:49 UTC

Re: IGNITE-13 (ready for review)

Hello everyone!

Colleagues, please take a look at the measurement results.

Can I close this ticket?

Should I add the JMH benchmark and unit test to the Ignite project?

Results of measuring
https://github.com/javaller/mybenchmark/blob/master/out.txt

Benchmark
https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/ExampleTest.java

UTest
https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/BinaryMarshallerSelfTest.java

*results of measuring*
Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op


Vadim

2017-03-06 19:42 GMT+03:00 Вадим Опольский <va...@gmail.com>:

> Hello, everybody!
>
> Valentin, I've corrected benchmark and received the results:
>
> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>
> https://github.com/javaller/MyBenchmark/blob/master/out_06_03_17_2.txt
>
> What's the next step?
>
> Do I have to add the benchmark to the Ignite project?
>
> Vadim Opolskiy
>
> 2017-03-03 21:11 GMT+03:00 Valentin Kulichenko <
> valentin.kulichenko@gmail.com>:
>
>> Hi Vadim,
>>
>> What do you mean by "copied benchmarks"? What changed since the previous
>> iteration, and why are the results so different?
>>
>> As for the duplicated loop, you don't need it. BinaryOutputStream allows
>> writing a value at a particular position (even before already written data).
>> So you can reserve 4 bytes for the length, remember the position, calculate
>> the length while encoding and writing bytes, and then write the length.
>>
>> -Val
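
The reserve-then-backpatch idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Ignite code: the class `BackpatchSketch`, the method `writeIntAt`, and the little-endian length layout are all hypothetical, and a plain growable byte array stands in for BinaryOutputStream. The per-character encoding rule is the one from the code quoted further down in the thread.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a growable output stream; not Ignite's BinaryOutputStream. */
final class BackpatchSketch {
    private byte[] buf = new byte[16];
    private int pos;

    /** Grows the backing array so that at least `extra` more bytes fit. */
    private void ensure(int extra) {
        if (pos + extra > buf.length)
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, pos + extra));
    }

    void writeByte(int b) {
        ensure(1);
        buf[pos++] = (byte) b;
    }

    /** Overwrites 4 bytes at an absolute position; byte order is an assumption of this sketch. */
    void writeIntAt(int at, int v) {
        buf[at] = (byte) v;
        buf[at + 1] = (byte) (v >>> 8);
        buf[at + 2] = (byte) (v >>> 16);
        buf[at + 3] = (byte) (v >>> 24);
    }

    /** Writes [length:int][encoded bytes] with a single pass over the string. */
    void writeString(String val) {
        ensure(4);
        int lenPos = pos; // remember where the length belongs
        pos += 4;         // reserve 4 bytes for it

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                writeByte(c);
            else if (c > 0x07FF) {
                writeByte(0xE0 | ((c >> 12) & 0x0F));
                writeByte(0x80 | ((c >> 6) & 0x3F));
                writeByte(0x80 | (c & 0x3F));
            }
            else {
                writeByte(0xC0 | ((c >> 6) & 0x1F));
                writeByte(0x80 | (c & 0x3F));
            }
        }

        // The encoded length is simply how far the cursor moved past the reserved slot.
        writeIntAt(lenPos, pos - lenPos - 4);
    }

    byte[] toByteArray() {
        return Arrays.copyOf(buf, pos);
    }
}
```

With this layout the separate length-counting loop disappears, at the cost of one extra positional write per string.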
>>
>> On Fri, Mar 3, 2017 at 12:45 AM, Вадим Опольский <va...@gmail.com>
>> wrote:
>>
>>> Valentin,
>>>
>>> What do you think about the duplicated loop in strToBinaryOutputStream?
>>>
>>> How can StrLen be calculated for outBinaryHeap without this loop?
>>>
>>> public class BinaryUtilsNew extends BinaryUtils {
>>>
>>>     public static int getStrLen(String val) {
>>>         int strLen = val.length();
>>>         int utfLen = 0;
>>>         int c;
>>>
>>>         // Determine length of resulting byte array.
>>>         for (int cnt = 0; cnt < strLen; cnt++) {
>>>             c = val.charAt(cnt);
>>>
>>>             if (c >= 0x0001 && c <= 0x007F)
>>>                 utfLen++;
>>>             else if (c > 0x07FF)
>>>                 utfLen += 3;
>>>             else
>>>                 utfLen += 2;
>>>         }
>>>
>>>         return utfLen;
>>>     }
>>>
>>>     public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
>>>         int strLen = val.length();
>>>         int c, cnt;
>>>
>>>         int position = 0;
>>>
>>>         outBinaryHeap.unsafeEnsure(1 + 4);
>>>
>>>         outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
>>>         outBinaryHeap.unsafeWriteInt(getStrLen(val));
>>>
>>>         for (cnt = 0; cnt < strLen; cnt++) {
>>>             c = val.charAt(cnt);
>>>
>>>             if (c >= 0x0001 && c <= 0x007F)
>>>                 outBinaryHeap.writeByte((byte) c);
>>>             else if (c > 0x07FF) {
>>>                 outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>             }
>>>             else {
>>>                 outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>             }
>>>         }
>>>     }
>>> }
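
A side note on the length rule in getStrLen above: it works per Java char, so it agrees with standard UTF-8 for ordinary text but differs for U+0000 (counted as 2 bytes) and for surrogate pairs (counted as 3 + 3 bytes instead of 4). The sketch below only compares lengths; the class `Utf8LengthCheck` and its method names are hypothetical and exist purely for this illustration.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthCheck {
    /** Same per-char rule as getStrLen in the quoted message. */
    static int customLen(String val) {
        int utfLen = 0;

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                utfLen++;
            else if (c > 0x07FF)
                utfLen += 3;
            else
                utfLen += 2;
        }

        return utfLen;
    }

    /** Length of the string in standard UTF-8, as produced by the JDK encoder. */
    static int standardLen(String val) {
        return val.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Test",                      // ASCII
            "\u0422\u0435\u0441\u0442",  // Cyrillic "Тест"
            "\0",                        // U+0000
            "\uD83D\uDE00"               // one supplementary code point (surrogate pair)
        };

        for (String s : samples)
            System.out.println(customLen(s) + " vs " + standardLen(s));
    }
}
```

For the benchmark strings in this thread ("Test" repeated) both rules give the same length, so the difference does not affect the measurements.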
>>>
>>>
>>> Vadim
>>>
>>>
>>>
>>> 2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <
>>> valentin.kulichenko@gmail.com>:
>>>
>>>> Vadim,
>>>>
>>>> Looks better now. Can you also try to modify the benchmark so that
>>>> marshaller and writer are created outside of the measured method? I.e. the
>>>> benchmark methods should be as simple as this:
>>>>
>>>>     @Benchmark
>>>>     public void binaryHeapOutputStreamDirect() throws Exception {
>>>>         writer.doWriteStringDirect(message);
>>>>     }
>>>>
>>>>     @Benchmark
>>>>     public void binaryHeapOutputStreamInDirect() throws Exception {
>>>>         writer.doWriteString(message);
>>>>     }
>>>>
>>>> In any case, do I understand correctly that it didn't actually make any
>>>> performance difference? If so, I think we can close the ticket.
>>>>
>>>> Vova, can you also take a look and provide your thoughts?
>>>>
>>>> -Val
>>>>
>>>> On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <va...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Valentin!
>>>>>
>>>>> I've created:
>>>>>
>>>>> new method strToUtf8BytesDirect in BinaryUtilsNew
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>
>>>>> new method doWriteStringDirect in BinaryWriterExImplNew
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>
>>>>> benchmarks for BinaryWriterExImpl doWriteString and
>>>>> BinaryWriterExImplNew doWriteStringDirect
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>
>>>>> This is a result of comparing:
>>>>>
>>>>> Benchmark                                   Mode  Cnt        Score        Error  Units
>>>>> ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
>>>>>
>>>>> Vadim
>>>>>
>>>>> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <
>>>>> valentin.kulichenko@gmail.com>:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> We're getting closer :) I would actually like to see the test for
>>>>>> actual implementation of BinaryWriterExImpl#doWriteString method.
>>>>>> Logic in binaryHeapOutputInDirect() confuses me a bit and I'm not sure
>>>>>> comparison is valid.
>>>>>>
>>>>>> Can you please do the following:
>>>>>>
>>>>>> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste
>>>>>> the code from existing BinaryUtils#strToUtf8Bytes and modify it so that it
>>>>>> takes BinaryOutputStream as an argument and writes to it directly. Do not
>>>>>> create stream inside this method, as it's the same as creating new array.
>>>>>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste the
>>>>>> code from existing BinaryWriterExImpl#doWriteString and modify it so
>>>>>> that it uses BinaryUtils#strToUtf8BytesDirect and doesn't
>>>>>> call out.writeByteArray.
>>>>>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>> I.e., create an instance of BinaryWriterExImpl and call doWriteString() in
>>>>>> benchmark method.
>>>>>> 4. Similarly, create benchmark for BinaryWriterExImpl#doWriteStringDirect.
>>>>>> 5. Compare results.
>>>>>>
>>>>>> This will give us clear picture of how these two approaches perform.
>>>>>> Your current results are actually promising, but I would like to confirm
>>>>>> them.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <vaopolskij@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi Valentin!
>>>>>>>
>>>>>>> Thank you for comments.
>>>>>>>
>>>>>>> There is a new method which writes directly to BinaryOutputStream
>>>>>>> instead of intermediate array.
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>
>>>>>>> There is benchmark.
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>>>>>>
>>>>>>> Unit test
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>>>>>>
>>>>>>> Statistics
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>>>>>>
>>>>>>> Benchmark                                 Mode  Cnt    Score    Error  Units
>>>>>>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>>>>>>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>>>>>>
>>>>>>>
>>>>>>> Vadim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>
>>>>>>>> Hi Vadim,
>>>>>>>>
>>>>>>>> Looks like you accidentally removed dev list from the thread,
>>>>>>>> adding it back.
>>>>>>>>
>>>>>>>> I think there is still misunderstanding. What I propose is to
>>>>>>>> modify the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>>>>>>> instead of intermediate array. This should decrease memory consumption and
>>>>>>>> can also increase performance as we will avoid 'writeByteArray'
>>>>>>>> step at the end.
>>>>>>>>
>>>>>>>> Does it make sense to you?
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <
>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi, Valentin!
>>>>>>>>>
>>>>>>>>> What do you think about using the methods of BinaryOutputStream:
>>>>>>>>>
>>>>>>>>> 1) writeByteArray(byte[] val)
>>>>>>>>> 2) writeCharArray(char[] val)
>>>>>>>>> 3) write (byte[] arr, int off, int len)
>>>>>>>>>
>>>>>>>>> String val = "Test";
>>>>>>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>
>>>>>>>>> String val = "Test";
>>>>>>>>> out.writeCharArray(val.toCharArray());
>>>>>>>>>
>>>>>>>>> String val = "Test";
>>>>>>>>> InputStream stream = new ByteArrayInputStream(
>>>>>>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>> byte[] buffer = new byte[1024];
>>>>>>>>> int n;
>>>>>>>>> // read(byte[]) returns the number of bytes read, or -1 at end of stream.
>>>>>>>>> while ((n = stream.read(buffer)) != -1) {
>>>>>>>>>     out.write(buffer, 0, n);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> What else can we use ?
>>>>>>>>>
>>>>>>>>> Vadim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hi Vadim,
>>>>>>>>>>
>>>>>>>>>> Which method implements the approach described in the ticket?
>>>>>>>>>> From what I see, all writeToStringX versions are still encoding into an
>>>>>>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>>>>>>> is the approach where bytes are written directly into the stream during
>>>>>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise
>>>>>>>>>> we will not know how to interpret the result.
>>>>>>>>>>
>>>>>>>>>> It looks like there is some misunderstanding here, so please let
>>>>>>>>>> me know if anything is still unclear. I will be happy to answer your questions.
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>
>>>>>>>>>>> Thanks, I will review this week.
>>>>>>>>>>>
>>>>>>>>>>> -Val
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>>
>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>
>>>>>>>>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>>>>>>>>> added new methods with the changes described in the ticket
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>>>>
>>>>>>>>>>>> I created a benchmark for BinaryWriterExImplNew
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>>>>
>>>>>>>>>>>> I ran the benchmark and compared the results
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>>>>>>
>>>>>>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>>>>>>> Benchmark                                    Mode  Cnt        Score        Error  Units
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>>>>>>
>>>>>>>>>>>> Is it OK? What's the next step? Do I have to move this
>>>>>>>>>>>> JMH benchmark to the Ignite project?
>>>>>>>>>>>>
>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure I understand your benchmarks and how they verify
>>>>>>>>>>>>> the optimization discussed here. Basically, here is what needs to be done:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString
>>>>>>>>>>>>> method.
>>>>>>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>>>>>>> 5. Compare results.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello everybody!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It collects timing data for serialization.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To run it, do the following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2) install it - mvn install
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think we misunderstood each other. My understanding of
>>>>>>>>>>>>>>> this optimization is the following.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What this ticket suggests is to write directly into stream
>>>>>>>>>>>>>>> while string is encoded, without intermediate array. This both reduces
>>>>>>>>>>>>>>> memory consumption and eliminates array copy step.
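
To make the difference between the two approaches concrete, here is a rough sketch that uses java.io.ByteArrayOutputStream as a stand-in for the Ignite stream. The class `TwoStepVsDirect` and both method names are hypothetical, the type and length headers are left out, and the direct variant uses plain UTF-8 rules for BMP characters only (surrogate pairs are not handled).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class TwoStepVsDirect {
    /** Two steps: encode into a temporary array, then copy that array into the stream. */
    static void writeTwoStep(ByteArrayOutputStream out, String val) {
        byte[] tmp = val.getBytes(StandardCharsets.UTF_8); // intermediate array
        out.write(tmp, 0, tmp.length);                     // copy into the stream
    }

    /** One step: every encoded byte goes straight into the stream, no temporary array. */
    static void writeDirect(ByteArrayOutputStream out, String val) {
        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c <= 0x007F)
                out.write(c);
            else if (c > 0x07FF) {
                out.write(0xE0 | ((c >> 12) & 0x0F));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
            else {
                out.write(0xC0 | ((c >> 6) & 0x1F));
                out.write(0x80 | (c & 0x3F));
            }
        }
    }
}
```

Both methods produce the same bytes for BMP text without surrogates; the direct variant saves the temporary array but pays for a stream call per byte, which is the trade-off a benchmark has to settle.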
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it
>>>>>>>>>>>>>>> gives any improvement?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It is hard to say whether it makes sense or not. No doubt,
>>>>>>>>>>>>>>>> it could speed up marshalling process at the cost of 2x memory required for
>>>>>>>>>>>>>>>> strings. From my previous experience with marshalling micro-optimizations,
>>>>>>>>>>>>>>>> we will hardly ever notice speedup in distributed environment.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But there is another side - it could speed up our queries,
>>>>>>>>>>>>>>>> because we will not have to unmarshal string on every field access. So I
>>>>>>>>>>>>>>>> would try to make this optimization optional and then measure query
>>>>>>>>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>>>>>>>>> interesting results.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can you please take a look and provide your thoughts? Can
>>>>>>>>>>>>>>>>> this be applied to binary marshaller? From what I recall, it serializes
>>>>>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>>>>>>>>> OptimizedMarshaller.
>>>>>>>>>>>>>>>>>> > However, I would check if this optimization is
>>>>>>>>>>>>>>>>>> applicable to
>>>>>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>>>>>>>>> vaopolskij@gmail.com>
>>>>>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: IGNITE-13 (ready for review)

Posted by Вадим Опольский <va...@gmail.com>.
Hi Valentin,

OK, thank you.

2017-03-11 20:33 GMT+03:00 Valentin Kulichenko <
valentin.kulichenko@gmail.com>:

> Hi Vadim,
>
> According to these results, I don't think it makes sense to make this
> change in the product. I closed the ticket.
>
> -Val
>
> On Fri, Mar 10, 2017 at 12:28 PM, Вадим Опольский <va...@gmail.com>
> wrote:
>
>> Hi Valentin,
>>
>> I can't figure out why it's better on a long string but worse on a short
>> string. Maybe it needs special tools such as Oracle Solaris Studio
>> Performance Analyzer or similar.
>>
>> I've added links to the benchmark, unit test and results to the ticket
>> and switched status to patch available.
>>
>> Vadim Opolski
>>
>>
>>
>> 2017-03-09 13:57 GMT+03:00 Valentin Kulichenko <
>> valentin.kulichenko@gmail.com>:
>>
>>> Hi Vadim,
>>>
>>> Results are a bit confusing. Any idea why it's better on a long string
>>> but worse on a short string? If that's actually the case, there is no
>>> reason to make the change and I would just close the ticket.
>>>
>>> -Val
>>>
>>> On Thu, Mar 9, 2017 at 9:20 AM, Вадим Опольский <va...@gmail.com>
>>> wrote:
>>>
>>>> Hello everyone!
>>>>
>>>> Colleagues, please take a look at the measurement results.
>>>>
>>>> Can I close this ticket?
>>>>
>>>> Should I add the JMH benchmark and unit test to the Ignite project?
>>>>
>>>> Results of measuring
>>>> https://github.com/javaller/mybenchmark/blob/master/out.txt
>>>>
>>>> Benchmark
>>>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>
>>>> UTest
>>>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/BinaryMarshallerSelfTest.java
>>>>
>>>> *results of measuring*
>>>> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>>>
>>>>
>>>> Vadim
>>>>
>>>> 2017-03-06 19:42 GMT+03:00 Вадим Опольский <va...@gmail.com>:
>>>>
>>>>> Hello, everybody!
>>>>>
>>>>> Valentin, I've corrected benchmark and received the results:
>>>>>
>>>>> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
>>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>>>>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_06_03_17_2.txt
>>>>>
>>>>> What's the next step?
>>>>>
>>>>> Do I have to add the benchmark to the Ignite project?
>>>>>
>>>>> Vadim Opolskiy
>>>>>
>>>>> 2017-03-03 21:11 GMT+03:00 Valentin Kulichenko <
>>>>> valentin.kulichenko@gmail.com>:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> What do you mean by "copied benchmarks"? What changed since the previous
>>>>>> iteration, and why are the results so different?
>>>>>>
>>>>>> As for the duplicated loop, you don't need it. BinaryOutputStream allows
>>>>>> writing a value at a particular position (even before already written
>>>>>> data). So you can reserve 4 bytes for the length, remember the position,
>>>>>> calculate the length while encoding and writing bytes, and then write the length.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Fri, Mar 3, 2017 at 12:45 AM, Вадим Опольский <
>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>
>>>>>>> Valentin,
>>>>>>>
>>>>>>> What do you think about the duplicated loop in strToBinaryOutputStream?
>>>>>>>
>>>>>>> How can StrLen be calculated for outBinaryHeap without this loop?
>>>>>>>
>>>>>>> public class BinaryUtilsNew extends BinaryUtils {
>>>>>>>
>>>>>>>     public static int getStrLen(String val) {
>>>>>>>         int strLen = val.length();
>>>>>>>         int utfLen = 0;
>>>>>>>         int c;
>>>>>>>
>>>>>>>         // Determine length of resulting byte array.
>>>>>>>         for (int cnt = 0; cnt < strLen; cnt++) {
>>>>>>>             c = val.charAt(cnt);
>>>>>>>
>>>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>>>                 utfLen++;
>>>>>>>             else if (c > 0x07FF)
>>>>>>>                 utfLen += 3;
>>>>>>>             else
>>>>>>>                 utfLen += 2;
>>>>>>>         }
>>>>>>>
>>>>>>>         return utfLen;
>>>>>>>     }
>>>>>>>
>>>>>>>     public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
>>>>>>>         int strLen = val.length();
>>>>>>>         int c, cnt;
>>>>>>>
>>>>>>>         int position = 0;
>>>>>>>
>>>>>>>         outBinaryHeap.unsafeEnsure(1 + 4);
>>>>>>>
>>>>>>>         outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
>>>>>>>         outBinaryHeap.unsafeWriteInt(getStrLen(val));
>>>>>>>
>>>>>>>         for (cnt = 0; cnt < strLen; cnt++) {
>>>>>>>             c = val.charAt(cnt);
>>>>>>>
>>>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>>>                 outBinaryHeap.writeByte((byte) c);
>>>>>>>             else if (c > 0x07FF) {
>>>>>>>                 outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
>>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
>>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>>>             }
>>>>>>>             else {
>>>>>>>                 outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
>>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Vadim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <
>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>
>>>>>>>> Vadim,
>>>>>>>>
>>>>>>>> Looks better now. Can you also try to modify the benchmark so that
>>>>>>>> marshaller and writer are created outside of the measured method? I.e. the
>>>>>>>> benchmark methods should be as simple as this:
>>>>>>>>
>>>>>>>>     @Benchmark
>>>>>>>>     public void binaryHeapOutputStreamDirect() throws Exception {
>>>>>>>>         writer.doWriteStringDirect(message);
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     @Benchmark
>>>>>>>>     public void binaryHeapOutputStreamInDirect() throws Exception {
>>>>>>>>         writer.doWriteString(message);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> In any case, do I understand correctly that it didn't actually make
>>>>>>>> any performance difference? If so, I think we can close the ticket.
>>>>>>>>
>>>>>>>> Vova, can you also take a look and provide your thoughts?
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <
>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Valentin!
>>>>>>>>>
>>>>>>>>> I've created:
>>>>>>>>>
>>>>>>>>> new method strToUtf8BytesDirect in BinaryUtilsNew
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>>
>>>>>>>>> new method doWriteStringDirect in BinaryWriterExImplNew
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>
>>>>>>>>> benchmarks for BinaryWriterExImpl doWriteString and
>>>>>>>>> BinaryWriterExImplNew doWriteStringDirect
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>
>>>>>>>>> This is a result of comparing:
>>>>>>>>>
>>>>>>>>> Benchmark                                   Mode  Cnt        Score        Error  Units
>>>>>>>>> ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
>>>>>>>>> ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
>>>>>>>>>
>>>>>>>>> Vadim
>>>>>>>>>
>>>>>>>>> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <
>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hi Vadim,
>>>>>>>>>>
>>>>>>>>>> We're getting closer :) I would actually like to see the test for
>>>>>>>>>> actual implementation of BinaryWriterExImpl#doWriteString
>>>>>>>>>> method. Logic in binaryHeapOutputInDirect() confuses me a bit and I'm not
>>>>>>>>>> sure comparison is valid.
>>>>>>>>>>
>>>>>>>>>> Can you please do the following:
>>>>>>>>>>
>>>>>>>>>> 1. Create new BinaryUtils#strToUtf8BytesDirect method,
>>>>>>>>>> copy-paste the code from existing BinaryUtils#strToUtf8Bytes and modify it
>>>>>>>>>> so that it takes BinaryOutputStream as an argument and writes to it
>>>>>>>>>> directly. Do not create stream inside this method, as it's the same as
>>>>>>>>>> creating new array.
>>>>>>>>>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste
>>>>>>>>>> the code from existing BinaryWriterExImpl#doWriteString and
>>>>>>>>>> modify it so that it uses BinaryUtils#strToUtf8BytesDirect and
>>>>>>>>>> doesn't call out.writeByteArray.
>>>>>>>>>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>>>>>> I.e., create an instance of BinaryWriterExImpl and call doWriteString() in
>>>>>>>>>> benchmark method.
>>>>>>>>>> 4. Similarly, create benchmark for BinaryWriterExImpl#doWriteStringDirect.
>>>>>>>>>> 5. Compare results.
>>>>>>>>>>
>>>>>>>>>> This will give us clear picture of how these two approaches
>>>>>>>>>> perform. Your current results are actually promising, but I would like to
>>>>>>>>>> confirm them.
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <
>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>
>>>>>>>>>>> Thank you for comments.
>>>>>>>>>>>
>>>>>>>>>>> There is a new method which writes directly
>>>>>>>>>>> to BinaryOutputStream instead of intermediate array.
>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>>>>
>>>>>>>>>>> There is benchmark.
>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>>>>>>>>>>
>>>>>>>>>>> Unit test
>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>>>>>>>>>>
>>>>>>>>>>> Statistics
>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>>>>>>>>>>
>>>>>>>>>>> Benchmark                                 Mode  Cnt    Score    Error  Units
>>>>>>>>>>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>>>>>>>>>>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Vadim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>
>>>>>>>>>>>> Looks like you accidentally removed dev list from the thread,
>>>>>>>>>>>> adding it back.
>>>>>>>>>>>>
>>>>>>>>>>>> I think there is still misunderstanding. What I propose is to
>>>>>>>>>>>> modify the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>>>>>>>>>>> instead of intermediate array. This should decrease memory consumption and
>>>>>>>>>>>> can also increase performance as we will avoid 'writeByteArray'
>>>>>>>>>>>> step at the end.
>>>>>>>>>>>>
>>>>>>>>>>>> Does it make sense to you?
>>>>>>>>>>>>
>>>>>>>>>>>> -Val
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <
>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, Valentin!
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you think about using the methods of
>>>>>>>>>>>>> BinaryOutputStream:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) writeByteArray(byte[] val)
>>>>>>>>>>>>> 2) writeCharArray(char[] val)
>>>>>>>>>>>>> 3) write (byte[] arr, int off, int len)
>>>>>>>>>>>>>
>>>>>>>>>>>>> // 1) writeByteArray
>>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>>>>
>>>>>>>>>>>>> // 2) writeCharArray
>>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>>> out.writeCharArray(val.toCharArray());
>>>>>>>>>>>>>
>>>>>>>>>>>>> // 3) write(arr, off, len), copying through a fixed-size buffer
>>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>>> InputStream stream = new ByteArrayInputStream(
>>>>>>>>>>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>>>> byte[] buffer = new byte[1024];
>>>>>>>>>>>>> int len;
>>>>>>>>>>>>> while ((len = stream.read(buffer)) != -1) {
>>>>>>>>>>>>>     out.write(buffer, 0, len);
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> What else can we use ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vadim
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Which method implements the approach described in the ticket?
>>>>>>>>>>>>>> From what I see, all writeToStringX versions are still encoding into an
>>>>>>>>>>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>>>>>>>>>>> is the approach where bytes are written directly into the stream during
>>>>>>>>>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise
>>>>>>>>>>>>>> we will not know how to interpret the result.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like there is some misunderstanding here, so please
>>>>>>>>>>>>>> let me know if anything is still unclear. I will be happy to answer your
>>>>>>>>>>>>>> questions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, I will review this week.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I created BinaryWriterExImplNew (which extends
>>>>>>>>>>>>>>>> BinaryWriterExImpl) and added new methods with the changes
>>>>>>>>>>>>>>>> described in the ticket
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I created a benchmark for BinaryWriterExImplNew
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I ran the benchmark and compared the results
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>>>>>>>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is it OK? What's the next step? Do I have to move this
>>>>>>>>>>>>>>>> JMH benchmark to the Ignite project?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm not sure I understand your benchmarks and how they
>>>>>>>>>>>>>>>>> verify the optimization discussed here. Basically, here is what needs to be
>>>>>>>>>>>>>>>>> done:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString
>>>>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>>>>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>>>>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>>>>>>>>>>> 5. Compare results.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello everybody!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It collects data on how long serialization takes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> To run it, do the following:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2) install it - mvn install
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think we misunderstood each other. My understanding of
>>>>>>>>>>>>>>>>>>> this optimization is the following.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> // Encode string into byte array.
>>>>>>>>>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val);
>>>>>>>>>>>>>>>>>>> // Write byte array into stream.
>>>>>>>>>>>>>>>>>>> out.writeByteArray(strArr);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What this ticket suggests is to write directly into
>>>>>>>>>>>>>>>>>>> stream while string is encoded, without intermediate array. This both
>>>>>>>>>>>>>>>>>>> reduces memory consumption and eliminates array copy step.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it
>>>>>>>>>>>>>>>>>>> gives any improvement?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>>>>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It is hard to say whether it makes sense or not. No
>>>>>>>>>>>>>>>>>>>> doubt, it could speed up marshalling process at the cost of 2x memory
>>>>>>>>>>>>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>>>>>>>>>>>>> micro-optimizations, we will hardly ever notice speedup in distributed
>>>>>>>>>>>>>>>>>>>> environment.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> But there is another side - it could speed up our
>>>>>>>>>>>>>>>>>>>> queries, because we will not have to unmarshal string on every field
>>>>>>>>>>>>>>>>>>>> access. So I would try to make this optimization optional and then measure
>>>>>>>>>>>>>>>>>>>> query performance with classes having lots of strings. It could give us
>>>>>>>>>>>>>>>>>>>> interesting results.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can you please take a look and provide your thoughts?
>>>>>>>>>>>>>>>>>>>>> Can this be applied to binary marshaller? From what I recall, it serializes
>>>>>>>>>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>>>>>>>>>>>>> OptimizedMarshaller.
>>>>>>>>>>>>>>>>>>>>>> > However, I would check if this optimization is
>>>>>>>>>>>>>>>>>>>>>> applicable to
>>>>>>>>>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>>>>>>>>>>>>> vaopolskij@gmail.com>
>>>>>>>>>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
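
The idea discussed in this thread, encoding a string straight into the
output stream and filling in the length afterwards, can be sketched roughly
as follows. This is a minimal, self-contained illustration and not Ignite
code: GrowableBuffer, reserveInt, writeIntAt and writeStringDirect are
hypothetical names standing in for BinaryOutputStream and the writer, the
little-endian length prefix is an assumption, and the type byte is left out.
The per-character branches follow the strToUtf8BytesDirect snippet quoted in
the thread.

```java
import java.util.Arrays;

public class DirectUtf8Sketch {
    /** Hypothetical growable byte buffer; a stand-in for a binary output stream. */
    static final class GrowableBuffer {
        private byte[] data = new byte[16];
        private int pos;

        /** Grows the backing array if fewer than 'extra' bytes are free. */
        void ensure(int extra) {
            if (pos + extra > data.length)
                data = Arrays.copyOf(data, Math.max(data.length * 2, pos + extra));
        }

        void writeByte(int b) {
            ensure(1);
            data[pos++] = (byte) b;
        }

        int position() {
            return pos;
        }

        /** Skips 4 bytes and returns their offset so a length can be written later. */
        int reserveInt() {
            ensure(4);
            int at = pos;
            pos += 4;
            return at;
        }

        /** Writes a little-endian int at a previously reserved offset. */
        void writeIntAt(int at, int v) {
            data[at] = (byte) v;
            data[at + 1] = (byte) (v >>> 8);
            data[at + 2] = (byte) (v >>> 16);
            data[at + 3] = (byte) (v >>> 24);
        }

        byte[] toByteArray() {
            return Arrays.copyOf(data, pos);
        }
    }

    /** Writes [4-byte length][encoded bytes] in a single pass over the string. */
    static void writeStringDirect(GrowableBuffer out, String val) {
        int lenPos = out.reserveInt();   // placeholder for the byte length
        int start = out.position();

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                out.writeByte(c);
            else if (c > 0x07FF) {
                out.writeByte(0xE0 | ((c >> 12) & 0x0F));
                out.writeByte(0x80 | ((c >> 6) & 0x3F));
                out.writeByte(0x80 | (c & 0x3F));
            }
            else {
                out.writeByte(0xC0 | ((c >> 6) & 0x1F));
                out.writeByte(0x80 | (c & 0x3F));
            }
        }

        // Back-patch the length now that the number of encoded bytes is known.
        out.writeIntAt(lenPos, out.position() - start);
    }

    public static void main(String[] args) {
        GrowableBuffer out = new GrowableBuffer();
        writeStringDirect(out, "Test");
        System.out.println(Arrays.toString(out.toByteArray()));
    }
}
```

Because the length is patched in after encoding, the string is traversed
only once and no intermediate byte array is allocated. Whether that is
faster in practice is the question the benchmarks in this thread try to
answer.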

Re: IGNITE-13 (ready for review)

Posted by Valentin Kulichenko <va...@gmail.com>.
Hi Vadim,

According to these results, I don't think it makes sense to make this
change in the product. I closed the ticket.

-Val
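
For reference, the figures under discussion can be compared with a few lines
of arithmetic. The sketch below only divides the averages reported in the
quoted LatchBenchmark table (scores use a decimal comma there); the class
and method names are made up for illustration. Read at face value, the
ratios come out at roughly 1.54 for the 36-character message, 0.76 for the
12-character one and 0.44 for "Test", where a value below 1.0 means the
direct variant took less time.

```java
import java.util.Locale;

public class BenchmarkRatioSketch {
    /** Ratio of direct to indirect average time; below 1.0 means direct was faster. */
    static double ratio(double directNs, double inDirectNs) {
        return directNs / inDirectNs;
    }

    public static void main(String[] args) {
        // Message lengths in characters and the reported averages in ns/op.
        int[] lengths = {36, 12, 4};
        double[] direct = {128.036, 44.934, 21.254};
        double[] inDirect = {83.262, 58.975, 48.506};

        for (int i = 0; i < lengths.length; i++)
            System.out.printf(Locale.ROOT, "%2d chars: direct/inDirect = %.2f%n",
                lengths[i], ratio(direct[i], inDirect[i]));
    }
}
```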

On Fri, Mar 10, 2017 at 12:28 PM, Вадим Опольский <va...@gmail.com>
wrote:

> Hi Valentin,
>
> I can't work out why it's better on a long string but worse on a short
> string. Finding out may require a special tool such as Oracle Solaris
> Studio Performance Analyzer.
>
> I've added links to the benchmark, unit test and results to the ticket and
> switched status to patch available.
>
> Vadim Opolski
>
>
>
> 2017-03-09 13:57 GMT+03:00 Valentin Kulichenko <
> valentin.kulichenko@gmail.com>:
>
>> Hi Vadim,
>>
>> Results are a bit confusing. Any idea why it's better on a long string but
>> worse on a short string? If that's actually the case, there is no
>> reason to make the change and I would just close the ticket.
>>
>> -Val
>>
>> On Thu, Mar 9, 2017 at 9:20 AM, Вадим Опольский <va...@gmail.com>
>> wrote:
>>
>>> Hello everyone!
>>>
>>> Colleagues, please take a look at the measurement results.
>>>
>>> Can I close this ticket?
>>>
>>> Should I add the JMH benchmark and unit test to the Ignite project?
>>>
>>> Results of measuring
>>> https://github.com/javaller/mybenchmark/blob/master/out.txt
>>>
>>> Benchmark
>>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>
>>> Unit test
>>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/BinaryMarshallerSelfTest.java
>>>
>>> *results of measuring*
>>> Benchmark                                      (message)                             Mode  Cnt    Score   Error  Units
>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>>
>>>
>>> Vadim
>>>
>>> 2017-03-06 19:42 GMT+03:00 Вадим Опольский <va...@gmail.com>:
>>>
>>>> Hello, everybody!
>>>>
>>>> Valentin, I've corrected the benchmark and got the following results:
>>>>
>>>> Benchmark                                      (message)                             Mode  Cnt    Score   Error  Units
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>>>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>>>
>>>> https://github.com/javaller/MyBenchmark/blob/master/out_06_03_17_2.txt
>>>>
>>>> What's the next step?
>>>>
>>>> Do I have to add the benchmark to the Ignite project?
>>>>
>>>> Vadim Opolskiy
>>>>
>>>> 2017-03-03 21:11 GMT+03:00 Valentin Kulichenko <
>>>> valentin.kulichenko@gmail.com>:
>>>>
>>>>> Hi Vadim,
>>>>>
>>>>> What do you mean by "copied benchmarks"? What changed since the previous
>>>>> iteration and why are the results so different?
>>>>>
>>>>> As for duplicated loop, you don't need it. BinaryOutputStream allows
>>>>> to write a value to a particular position (even before already written
>>>>> data). So you can reserve 4 bytes for length, remember position, calculate
>>>>> length while encoding and writing bytes, and then write length.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Fri, Mar 3, 2017 at 12:45 AM, Вадим Опольский <vaopolskij@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> Valentin,
>>>>>>
>>>>>> What do you think about the duplicated loop in strToBinaryOutputStream?
>>>>>>
>>>>>> How can StrLen be calculated for outBinaryHeap without this loop?
>>>>>>
>>>>>> public class BinaryUtilsNew extends BinaryUtils {
>>>>>>
>>>>>>     public static int getStrLen(String val) {
>>>>>>         int strLen = val.length();
>>>>>>         int utfLen = 0;
>>>>>>         int c;
>>>>>>
>>>>>>         // Determine length of resulting byte array.
>>>>>>         // First pass over the string (the loop in question).
>>>>>>         for (int cnt = 0; cnt < strLen; cnt++) {
>>>>>>             c = val.charAt(cnt);
>>>>>>
>>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>>                 utfLen++;
>>>>>>             else if (c > 0x07FF)
>>>>>>                 utfLen += 3;
>>>>>>             else
>>>>>>                 utfLen += 2;
>>>>>>         }
>>>>>>
>>>>>>         return utfLen;
>>>>>>     }
>>>>>>
>>>>>>     public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
>>>>>>         int strLen = val.length();
>>>>>>         int c, cnt;
>>>>>>
>>>>>>         int position = 0;
>>>>>>
>>>>>>         outBinaryHeap.unsafeEnsure(1 + 4);
>>>>>>         outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
>>>>>>         outBinaryHeap.unsafeWriteInt(getStrLen(val));
>>>>>>
>>>>>>         // Second pass over the string with the same branches.
>>>>>>         for (cnt = 0; cnt < strLen; cnt++) {
>>>>>>             c = val.charAt(cnt);
>>>>>>
>>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>>                 outBinaryHeap.writeByte((byte) c);
>>>>>>             else if (c > 0x07FF) {
>>>>>>                 outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>>             }
>>>>>>             else {
>>>>>>                 outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
>>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>>             }
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Vadim
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <
>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>
>>>>>>> Vadim,
>>>>>>>
>>>>>>> Looks better now. Can you also try to modify the benchmark so that
>>>>>>> marshaller and writer are created outside of the measured method? I.e. the
>>>>>>> benchmark methods should be as simple as this:
>>>>>>>
>>>>>>>     @Benchmark
>>>>>>>     public void binaryHeapOutputStreamDirect() throws Exception {
>>>>>>>         writer.doWriteStringDirect(message);
>>>>>>>     }
>>>>>>>
>>>>>>>     @Benchmark
>>>>>>>     public void binaryHeapOutputStreamInDirect() throws Exception {
>>>>>>>         writer.doWriteString(message);
>>>>>>>     }
>>>>>>>
>>>>>>> In any case, do I understand correctly that it didn't actually make
>>>>>>> any performance difference? If so, I think we can close the ticket.
>>>>>>>
>>>>>>> Vova, can you also take a look and provide your thoughts?
>>>>>>>
>>>>>>> -Val
>>>>>>>
>>>>>>> On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <
>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Valentin!
>>>>>>>>
>>>>>>>> I've created:
>>>>>>>>
>>>>>>>> new method strToUtf8BytesDirect in BinaryUtilsNew
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>
>>>>>>>> new method doWriteStringDirect in BinaryWriterExImplNew
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>
>>>>>>>> benchmarks for BinaryWriterExImpl doWriteString and
>>>>>>>> BinaryWriterExImplNew doWriteStringDirect
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>
>>>>>>>> This is the result of the comparison:
>>>>>>>>
>>>>>>>> Benchmark                                   Mode  Cnt        Score       Error  Units
>>>>>>>> ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
>>>>>>>> ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
>>>>>>>>
>>>>>>>> Vadim
>>>>>>>>
>>>>>>>> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <
>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>
>>>>>>>>> Hi Vadim,
>>>>>>>>>
>>>>>>>>> We're getting closer :) I would actually like to see the test for
>>>>>>>>> actual implementation of BinaryWriterExImpl#doWriteString method.
>>>>>>>>> Logic in binaryHeapOutputInDirect() confuses me a bit and I'm not sure
>>>>>>>>> comparison is valid.
>>>>>>>>>
>>>>>>>>> Can you please do the following:
>>>>>>>>>
>>>>>>>>> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste
>>>>>>>>> the code from existing BinaryUtils#strToUtf8Bytes and modify it so that it
>>>>>>>>> takes BinaryOutputStream as an argument and writes to it directly. Do not
>>>>>>>>> create stream inside this method, as it's the same as creating new array.
>>>>>>>>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste
>>>>>>>>> the code from existing BinaryWriterExImpl#doWriteString and
>>>>>>>>> modify it so that it uses BinaryUtils#strToUtf8BytesDirect and
>>>>>>>>> doesn't call out.writeByteArray.
>>>>>>>>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>>>>> I.e., create an instance of BinaryWriterExImpl and call doWriteString() in
>>>>>>>>> benchmark method.
>>>>>>>>> 4. Similarly, create benchmark for
>>>>>>>>> BinaryWriterExImpl#doWriteStringDirect.
>>>>>>>>> 5. Compare results.
>>>>>>>>>
>>>>>>>>> This will give us clear picture of how these two approaches
>>>>>>>>> perform. Your current results are actually promising, but I would like to
>>>>>>>>> confirm them.
>>>>>>>>>
>>>>>>>>> -Val
>>>>>>>>>
>>>>>>>>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <
>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Valentin!
>>>>>>>>>>
>>>>>>>>>> Thank you for comments.
>>>>>>>>>>
>>>>>>>>>> There is a new method which writes directly to BinaryOutputStream
>>>>>>>>>> instead of an intermediate array.
>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>>>
>>>>>>>>>> Here is the benchmark.
>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>>>>>>>>>
>>>>>>>>>> Unit test
>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>>>>>>>>>
>>>>>>>>>> Statistics
>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>>>>>>>>>
>>>>>>>>>> Benchmark                                 Mode  Cnt    Score   Error  Units
>>>>>>>>>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>>>>>>>>>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Vadim
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>
>>>>>>>>>>> Looks like you accidentally removed dev list from the thread,
>>>>>>>>>>> adding it back.
>>>>>>>>>>>
>>>>>>>>>>> I think there is still misunderstanding. What I propose is to
>>>>>>>>>>> modify the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>>>>>>>>>> instead of intermediate array. This should decrease memory consumption and
>>>>>>>>>>> can also increase performance as we will avoid 'writeByteArray'
>>>>>>>>>>> step at the end.
>>>>>>>>>>>
>>>>>>>>>>> Does it make sense to you?
>>>>>>>>>>>
>>>>>>>>>>> -Val
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <
>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Valentin!
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think about using the methods of BinaryOutputStream:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) writeByteArray(byte[] val)
>>>>>>>>>>>> 2) writeCharArray(char[] val)
>>>>>>>>>>>> 3) write (byte[] arr, int off, int len)
>>>>>>>>>>>>
>>>>>>>>>>>> // 1) writeByteArray
>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>>>
>>>>>>>>>>>> // 2) writeCharArray
>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>> out.writeCharArray(val.toCharArray());
>>>>>>>>>>>>
>>>>>>>>>>>> // 3) write(arr, off, len), copying through a fixed-size buffer
>>>>>>>>>>>> String val = "Test";
>>>>>>>>>>>> InputStream stream = new ByteArrayInputStream(
>>>>>>>>>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>>> byte[] buffer = new byte[1024];
>>>>>>>>>>>> int len;
>>>>>>>>>>>> while ((len = stream.read(buffer)) != -1) {
>>>>>>>>>>>>     out.write(buffer, 0, len);
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> What else can we use ?
>>>>>>>>>>>>
>>>>>>>>>>>> Vadim
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which method implements the approach described in the ticket?
>>>>>>>>>>>>> From what I see, all writeToStringX versions are still encoding into an
>>>>>>>>>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>>>>>>>>>> is the approach where bytes are written directly into the stream during
>>>>>>>>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise
>>>>>>>>>>>>> we will not know how to interpret the result.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like there is some misunderstanding here, so please
>>>>>>>>>>>>> let me know if anything is still unclear. I will be happy to answer your
>>>>>>>>>>>>> questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, I will review this week.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I created BinaryWriterExImplNew (which extends
>>>>>>>>>>>>>>> BinaryWriterExImpl) and added new methods with the changes
>>>>>>>>>>>>>>> described in the ticket
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I created a benchmark for BinaryWriterExImplNew
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I ran the benchmark and compared the results
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>>>>>>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is it OK? What's the next step? Do I have to move this
>>>>>>>>>>>>>>> JMH benchmark to the Ignite project?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm not sure I understand your benchmarks and how they
>>>>>>>>>>>>>>>> verify the optimization discussed here. Basically, here is what needs to be
>>>>>>>>>>>>>>>> done:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString
>>>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>>>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>>>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>>>>>>>>>> 5. Compare results.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello everybody!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It collects data on how long serialization takes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To run it, do the following:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2) install it - mvn install
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I think we misunderstood each other. My understanding of
>>>>>>>>>>>>>>>>>> this optimization is the following.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // Encode string into byte array.
>>>>>>>>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val);
>>>>>>>>>>>>>>>>>> // Write byte array into stream.
>>>>>>>>>>>>>>>>>> out.writeByteArray(strArr);
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What this ticket suggests is to write directly into
>>>>>>>>>>>>>>>>>> stream while string is encoded, without intermediate array. This both
>>>>>>>>>>>>>>>>>> reduces memory consumption and eliminates array copy step.
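
The difference between the two approaches can be sketched in plain Java. This is only an illustration under stated assumptions: `ByteSink` is a hypothetical stand-in for the output stream (it is not Ignite's BinaryOutputStream), the two-step variant uses the JDK's `String.getBytes` instead of `BinaryUtils.strToUtf8Bytes`, and the direct variant reuses the 1/2/3-byte encoding rules from the code quoted in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DirectEncodeSketch {
    /** Hypothetical growable byte sink standing in for an output stream. */
    static final class ByteSink {
        private byte[] data = new byte[16];
        private int pos;

        void writeByte(int b) {
            if (pos == data.length)
                data = Arrays.copyOf(data, data.length * 2);

            data[pos++] = (byte)b;
        }

        void writeByteArray(byte[] src) {
            for (byte b : src)
                writeByte(b);
        }

        byte[] toArray() {
            return Arrays.copyOf(data, pos);
        }
    }

    /** Two steps: encode into a temporary array, then copy that array into the sink. */
    static byte[] encodeTwoStep(String val) {
        ByteSink out = new ByteSink();
        byte[] tmp = val.getBytes(StandardCharsets.UTF_8); // Intermediate array.

        out.writeByteArray(tmp);

        return out.toArray();
    }

    /** One step: every encoded byte goes straight into the sink, no temporary array. */
    static byte[] encodeDirect(String val) {
        ByteSink out = new ByteSink();

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                out.writeByte(c);
            else if (c > 0x07FF) {
                out.writeByte(0xE0 | ((c >> 12) & 0x0F));
                out.writeByte(0x80 | ((c >> 6) & 0x3F));
                out.writeByte(0x80 | (c & 0x3F));
            }
            else {
                out.writeByte(0xC0 | ((c >> 6) & 0x1F));
                out.writeByte(0x80 | (c & 0x3F));
            }
        }

        return out.toArray();
    }

    public static void main(String[] args) {
        String msg = "Test \u00e9 \u20ac";

        // Both paths should produce the same bytes for this input.
        System.out.println(Arrays.equals(encodeTwoStep(msg), encodeDirect(msg)));
    }
}
```

Whether the direct path is actually faster depends on how cheap the per-byte write is compared with one bulk array copy, which is what the benchmark is meant to measure.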
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it
>>>>>>>>>>>>>>>>>> gives any improvement?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>>>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It is hard to say whether it makes sense or not. No
>>>>>>>>>>>>>>>>>>> doubt, it could speed up marshalling process at the cost of 2x memory
>>>>>>>>>>>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>>>>>>>>>>>> micro-optimizations, we will hardly ever notice speedup in distributed
>>>>>>>>>>>>>>>>>>> environment.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> But there is another side: it could speed up our
>>>>>>>>>>>>>>>>>>> queries, because we will not have to unmarshal string on every field
>>>>>>>>>>>>>>>>>>> access. So I would try to make this optimization optional and then measure
>>>>>>>>>>>>>>>>>>> query performance with classes having lots of strings. It could give us
>>>>>>>>>>>>>>>>>>> interesting results.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can you please take a look and provide your thoughts?
>>>>>>>>>>>>>>>>>>>> Can this be applied to binary marshaller? From what I recall, it serializes
>>>>>>>>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>>>>>>>>>>>> OptimizedMarshaller.
>>>>>>>>>>>>>>>>>>>>> > However, I would check if this optimization is
>>>>>>>>>>>>>>>>>>>>> applicable to
>>>>>>>>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>>>>>>>>>>>> vaopolskij@gmail.com>
>>>>>>>>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>

Re: IGNITE-13 (ready for review)

Posted by Вадим Опольский <va...@gmail.com>.
Hi Valentin,

I can't work out why it's better on a long string but worse on a short
string. Maybe it needs special tools such as Oracle Solaris Studio
Performance Analyzer or similar.

I've added links to the benchmark, unit test and results to the ticket and
switched status to patch available.

Vadim Opolski



2017-03-09 13:57 GMT+03:00 Valentin Kulichenko <
valentin.kulichenko@gmail.com>:

> Hi Vadim,
>
> Results are a bit confusing. Any idea why it's better on a long string, but
> worse on a short string? If that's actually the case, there is no
> reason to make the change and I would just close the ticket.
>
> -Val
>
> On Thu, Mar 9, 2017 at 9:20 AM, Вадим Опольский <va...@gmail.com>
> wrote:
>
>> Hello everyone!
>>
>> Colleagues, please take a look at the measurement results.
>>
>> Can I close this ticket?
>>
>> Should I add the JMH benchmark and unit test to the Ignite project?
>>
>> Results of measuring
>> https://github.com/javaller/mybenchmark/blob/master/out.txt
>>
>> Benchmark
>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>
>> Unit test
>> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/BinaryMarshallerSelfTest.java
>>
>> *results of measuring*
>> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>
>>
>> Vadim
>>
>> 2017-03-06 19:42 GMT+03:00 Вадим Опольский <va...@gmail.com>:
>>
>>> Hello, everybody!
>>>
>>> Valentin, I've corrected benchmark and received the results:
>>>
>>> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>>
>>> https://github.com/javaller/MyBenchmark/blob/master/out_06_03_17_2.txt
>>>
>>> What's the next step?
>>>
>>> Do I have to add the benchmark to the Ignite project?
>>>
>>> Vadim Opolskiy
>>>
>>> 2017-03-03 21:11 GMT+03:00 Valentin Kulichenko <
>>> valentin.kulichenko@gmail.com>:
>>>
>>>> Hi Vadim,
>>>>
>>>> What do you mean by "copied benchmarks"? What changed since the previous
>>>> iteration, and why are the results so different?
>>>>
>>>> As for the duplicated loop, you don't need it. BinaryOutputStream allows you to
>>>> write a value to a particular position (even before already written data).
>>>> So you can reserve 4 bytes for the length, remember the position, calculate the
>>>> length while encoding and writing bytes, and then write the length.
>>>>
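
The "reserve, encode, then write the length" idea can be sketched as follows. This is a minimal illustration, not the Ignite implementation: it writes into a plain `byte[]` instead of a BinaryOutputStream, the class and method names are hypothetical, and the little-endian length is an assumption of the sketch.

```java
import java.util.Arrays;

public class LengthBackpatchSketch {
    /** Encodes val as [4-byte length][encoded bytes] with a single pass over the string. */
    static byte[] writeString(String val) {
        // Worst case for this encoding is 3 bytes per char, plus the 4-byte length slot.
        byte[] buf = new byte[4 + 3 * val.length()];

        int lenPos = 0;       // Remember where the length belongs.
        int pos = lenPos + 4; // Skip the reserved slot; the payload starts here.

        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);

            if (c >= 0x0001 && c <= 0x007F)
                buf[pos++] = (byte)c;
            else if (c > 0x07FF) {
                buf[pos++] = (byte)(0xE0 | ((c >> 12) & 0x0F));
                buf[pos++] = (byte)(0x80 | ((c >> 6) & 0x3F));
                buf[pos++] = (byte)(0x80 | (c & 0x3F));
            }
            else {
                buf[pos++] = (byte)(0xC0 | ((c >> 6) & 0x1F));
                buf[pos++] = (byte)(0x80 | (c & 0x3F));
            }
        }

        // The length is known only now; write it back into the reserved slot.
        int len = pos - (lenPos + 4);

        buf[lenPos] = (byte)len;
        buf[lenPos + 1] = (byte)(len >>> 8);
        buf[lenPos + 2] = (byte)(len >>> 16);
        buf[lenPos + 3] = (byte)(len >>> 24);

        return Arrays.copyOf(buf, pos);
    }

    /** Reads the little-endian length written by writeString. */
    static int readLength(byte[] buf) {
        return (buf[0] & 0xFF) | ((buf[1] & 0xFF) << 8) | ((buf[2] & 0xFF) << 16) | ((buf[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] res = writeString("Test");

        System.out.println("length field = " + readLength(res) + ", total bytes = " + res.length);
    }
}
```

With this layout the separate length-counting loop is no longer needed, because the length falls out of the write position once encoding is finished.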
>>>> -Val
>>>>
>>>> On Fri, Mar 3, 2017 at 12:45 AM, Вадим Опольский <va...@gmail.com>
>>>> wrote:
>>>>
>>>>> Valentin,
>>>>>
>>>>> What do you think about the duplicated loop in strToBinaryOutputStream?
>>>>>
>>>>> How can StrLen be calculated for outBinaryHeap without this loop?
>>>>>
>>>>> public class BinaryUtilsNew extends BinaryUtils {
>>>>>
>>>>>     public static int getStrLen(String val) {
>>>>>         int strLen = val.length();
>>>>>         int utfLen = 0;
>>>>>         int c;
>>>>>
>>>>>         // Determine length of resulting byte array.
>>>>>         for (int cnt = 0; cnt < strLen; cnt++) {
>>>>>             c = val.charAt(cnt);
>>>>>
>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>                 utfLen++;
>>>>>             else if (c > 0x07FF)
>>>>>                 utfLen += 3;
>>>>>             else
>>>>>                 utfLen += 2;
>>>>>         }
>>>>>
>>>>>         return utfLen;
>>>>>     }
>>>>>
>>>>>     public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
>>>>>         int strLen = val.length();
>>>>>         int c, cnt;
>>>>>
>>>>>         int position = 0;
>>>>>
>>>>>         outBinaryHeap.unsafeEnsure(1 + 4);
>>>>>
>>>>>         // Header: type marker, then the encoded length (first pass over the string).
>>>>>         outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
>>>>>         outBinaryHeap.unsafeWriteInt(getStrLen(val));
>>>>>
>>>>>         // Second pass: encode each char and write it straight to the stream.
>>>>>         for (cnt = 0; cnt < strLen; cnt++) {
>>>>>             c = val.charAt(cnt);
>>>>>
>>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>>                 outBinaryHeap.writeByte((byte) c);
>>>>>             else if (c > 0x07FF) {
>>>>>                 outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>             }
>>>>>             else {
>>>>>                 outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
>>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>>             }
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
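
As a sanity check on the length rule used in getStrLen, the sketch below compares it with the byte count the JDK's own UTF-8 encoder reports. The class name `UtfLenCheck` is hypothetical; the loop is copied from the quoted code. The two agree for ordinary characters and differ for U+0000 and for surrogate pairs, which this rule counts as 2 bytes and 3 bytes per surrogate respectively.

```java
import java.nio.charset.StandardCharsets;

public class UtfLenCheck {
    /** Same counting rule as getStrLen in the quoted code. */
    static int utfLen(String val) {
        int utfLen = 0;

        for (int cnt = 0; cnt < val.length(); cnt++) {
            int c = val.charAt(cnt);

            if (c >= 0x0001 && c <= 0x007F)
                utfLen++;
            else if (c > 0x07FF)
                utfLen += 3;
            else
                utfLen += 2;
        }

        return utfLen;
    }

    public static void main(String[] args) {
        // ASCII, a 2-byte char, a 3-byte char, NUL, and a surrogate pair.
        String[] samples = {"Test", "\u00e9", "\u20ac", "\u0000", "\uD83D\uDE00"};

        for (String s : samples) {
            int jdkLen = s.getBytes(StandardCharsets.UTF_8).length;

            System.out.println("chars=" + s.length() + " utfLen=" + utfLen(s) + " jdk=" + jdkLen);
        }
    }
}
```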
>>>>>
>>>>> Vadim
>>>>>
>>>>>
>>>>>
>>>>> 2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <
>>>>> valentin.kulichenko@gmail.com>:
>>>>>
>>>>>> Vadim,
>>>>>>
>>>>>> Looks better now. Can you also try to modify the benchmark so that
>>>>>> marshaller and writer are created outside of the measured method? I.e. the
>>>>>> benchmark methods should be as simple as this:
>>>>>>
>>>>>>     @Benchmark
>>>>>>     public void binaryHeapOutputStreamDirect() throws Exception {
>>>>>>         writer.doWriteStringDirect(message);
>>>>>>     }
>>>>>>
>>>>>>     @Benchmark
>>>>>>     public void binaryHeapOutputStreamInDirect() throws Exception {
>>>>>>         writer.doWriteString(message);
>>>>>>     }
>>>>>>
>>>>>> In any case, do I understand correctly that it didn't actually make
>>>>>> any performance difference? If so, I think we can close the ticket.
>>>>>>
>>>>>> Vova, can you also take a look and provide your thoughts?
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <vaopolskij@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi Valentin!
>>>>>>>
>>>>>>> I've created:
>>>>>>>
>>>>>>> new method strToUtf8BytesDirect in BinaryUtilsNew
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>
>>>>>>> new method doWriteStringDirect in BinaryWriterExImplNew
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>
>>>>>>> benchmarks for BinaryWriterExImpl doWriteString and
>>>>>>> BinaryWriterExImplNew doWriteStringDirect
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>
>>>>>>> This is a result of comparing:
>>>>>>>
>>>>>>> Benchmark                                   Mode  Cnt        Score       Error  Units
>>>>>>> ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
>>>>>>>
>>>>>>> Vadim
>>>>>>>
>>>>>>> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <
>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>
>>>>>>>> Hi Vadim,
>>>>>>>>
>>>>>>>> We're getting closer :) I would actually like to see the test for
>>>>>>>> actual implementation of BinaryWriterExImpl#doWriteString method.
>>>>>>>> Logic in binaryHeapOutputInDirect() confuses me a bit and I'm not sure
>>>>>>>> comparison is valid.
>>>>>>>>
>>>>>>>> Can you please do the following:
>>>>>>>>
>>>>>>>> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste
>>>>>>>> the code from existing BinaryUtils#strToUtf8Bytes and modify it so that it
>>>>>>>> takes BinaryOutputStream as an argument and writes to it directly. Do not
>>>>>>>> create stream inside this method, as it's the same as creating new array.
>>>>>>>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste
>>>>>>>> the code from existing BinaryWriterExImpl#doWriteString and modify
>>>>>>>> it so that it uses BinaryUtils#strToUtf8BytesDirect and doesn't
>>>>>>>> call out.writeByteArray.
>>>>>>>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>>>> I.e., create an instance of BinaryWriterExImpl and call doWriteString() in
>>>>>>>> benchmark method.
>>>>>>>> 4. Similarly, create benchmark for
>>>>>>>> BinaryWriterExImpl#doWriteStringDirect.
>>>>>>>> 5. Compare results.
>>>>>>>>
>>>>>>>> This will give us clear picture of how these two approaches
>>>>>>>> perform. Your current results are actually promising, but I would like to
>>>>>>>> confirm them.
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <
>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Valentin!
>>>>>>>>>
>>>>>>>>> Thank you for comments.
>>>>>>>>>
>>>>>>>>> There is a new method which writes directly to BinaryOutputStream
>>>>>>>>> instead of intermediate array.
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>>
>>>>>>>>> Benchmark
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>>>>>>>>
>>>>>>>>> Unit test
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>>>>>>>>
>>>>>>>>> Statistics
>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>>>>>>>>
>>>>>>>>> Benchmark                                 Mode  Cnt    Score    Error  Units
>>>>>>>>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>>>>>>>>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Vadim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hi Vadim,
>>>>>>>>>>
>>>>>>>>>> Looks like you accidentally removed dev list from the thread,
>>>>>>>>>> adding it back.
>>>>>>>>>>
>>>>>>>>>> I think there is still a misunderstanding. What I propose is to
>>>>>>>>>> modify BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>>>>>>>>> instead of an intermediate array. This should decrease memory consumption and
>>>>>>>>>> can also increase performance, as we will avoid the 'writeByteArray'
>>>>>>>>>> step at the end.
>>>>>>>>>>
>>>>>>>>>> Does it make sense to you?
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <
>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, Valentin!
>>>>>>>>>>>
>>>>>>>>>>> What do you think about using the methods of BinaryOutputStream:
>>>>>>>>>>>
>>>>>>>>>>> 1) writeByteArray(byte[] val)
>>>>>>>>>>> 2) writeCharArray(char[] val)
>>>>>>>>>>> 3) write (byte[] arr, int off, int len)
>>>>>>>>>>>
>>>>>>>>>>> String val = "Test";
>>>>>>>>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>>
>>>>>>>>>>> String val = "Test";
>>>>>>>>>>> out.writeCharArray(val.toCharArray());
>>>>>>>>>>>
>>>>>>>>>>> String val = "Test";
>>>>>>>>>>> InputStream stream = new ByteArrayInputStream(
>>>>>>>>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>> byte[] buffer = new byte[1024];
>>>>>>>>>>> int len;
>>>>>>>>>>> // read(byte[]) returns the number of bytes read, or -1 at end of stream.
>>>>>>>>>>> while ((len = stream.read(buffer)) != -1) {
>>>>>>>>>>>     out.write(buffer, 0, len);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> What else can we use ?
>>>>>>>>>>>
>>>>>>>>>>> Vadim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>
>>>>>>>>>>>> Which method implements the approach described in the ticket?
>>>>>>>>>>>> From what I see, all writeToStringX versions are still encoding into an
>>>>>>>>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>>>>>>>>> is the approach where bytes are written directly into the stream during
>>>>>>>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise
>>>>>>>>>>>> we will not know how to interpret the result.
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like there is some misunderstanding here, so please
>>>>>>>>>>>> let me know if anything is still unclear. I will be happy to answer your
>>>>>>>>>>>> questions.
>>>>>>>>>>>>
>>>>>>>>>>>> -Val
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, I will review this week.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I created BinaryWriterExImplNew (which extends
>>>>>>>>>>>>>> BinaryWriterExImpl) and added new methods with the changes
>>>>>>>>>>>>>> described in the ticket:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I created a benchmark for BinaryWriterExImplNew:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ran the benchmark and compared the results:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>>>>>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it OK? What's the next step? Do I have to move this
>>>>>>>>>>>>>> JMH benchmark to the Ignite project?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not sure I understand your benchmarks and how they
>>>>>>>>>>>>>>> verify the optimization discussed here. Basically, here is what needs to be
>>>>>>>>>>>>>>> done:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString
>>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>>>>>>>>> 5. Compare results.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello everybody!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It collects data on how long serialization takes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For instance -
>>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To run it, do the following:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2) install it - mvn install
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think we misunderstood each other. My understanding of
>>>>>>>>>>>>>>>>> this optimization is the following.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Encode string into byte array.
>>>>>>>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val);
>>>>>>>>>>>>>>>>> // Write byte array into stream.
>>>>>>>>>>>>>>>>> out.writeByteArray(strArr);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What this ticket suggests is to write directly into stream
>>>>>>>>>>>>>>>>> while string is encoded, without intermediate array. This both reduces
>>>>>>>>>>>>>>>>> memory consumption and eliminates array copy step.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it
>>>>>>>>>>>>>>>>> gives any improvement?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It is hard to say whether it makes sense or not. No
>>>>>>>>>>>>>>>>>> doubt, it could speed up marshalling process at the cost of 2x memory
>>>>>>>>>>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>>>>>>>>>>> micro-optimizations, we will hardly ever notice speedup in distributed
>>>>>>>>>>>>>>>>>> environment.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But there is another side: it could speed up our
>>>>>>>>>>>>>>>>>> queries, because we will not have to unmarshal string on every field
>>>>>>>>>>>>>>>>>> access. So I would try to make this optimization optional and then measure
>>>>>>>>>>>>>>>>>> query performance with classes having lots of strings. It could give us
>>>>>>>>>>>>>>>>>> interesting results.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can you please take a look and provide your thoughts?
>>>>>>>>>>>>>>>>>>> Can this be applied to binary marshaller? From what I recall, it serializes
>>>>>>>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>>>>>>>>>>> OptimizedMarshaller.
>>>>>>>>>>>>>>>>>>>> > However, I would check if this optimization is
>>>>>>>>>>>>>>>>>>>> applicable to
>>>>>>>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>>>>>>>>>>> vaopolskij@gmail.com>
>>>>>>>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>>> >

Re: IGNITE-13 (ready for review)

Posted by Valentin Kulichenko <va...@gmail.com>.
Hi Vadim,

Results are a bit confusing. Any idea why it's better on a long string, but
worse on a short string? If that's actually the case, there is no
reason to make the change and I would just close the ticket.

-Val

On Thu, Mar 9, 2017 at 9:20 AM, Вадим Опольский <va...@gmail.com>
wrote:

> Hello everyone!
>
> Colleagues, please take a look at the measurement results.
>
> Can I close this ticket?
>
> Should I add the JMH benchmark and unit test to the Ignite project?
>
> Results of measuring
> https://github.com/javaller/mybenchmark/blob/master/out.txt
>
> Benchmark
> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>
> Unit test
> https://github.com/javaller/mybenchmark/blob/master/src/main/java/org/sample/BinaryMarshallerSelfTest.java
>
> *results of measuring*
> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>
>
> Vadim
>
> 2017-03-06 19:42 GMT+03:00 Вадим Опольский <va...@gmail.com>:
>
>> Hello, everybody!
>>
>> Valentin, I've corrected benchmark and received the results:
>>
>> Benchmark                                      (message)                             Mode  Cnt    Score    Error  Units
>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTestTestTestTestTestTestTest  avgt   50  128,036 ± 4,360  ns/op
>> LatchBenchmark.binaryHeapOutputStreamDirect    TestTestTest                          avgt   50   44,934 ± 1,463  ns/op
>> LatchBenchmark.binaryHeapOutputStreamDirect    Test                                  avgt   50   21,254 ± 0,776  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTestTestTestTestTestTestTest  avgt   50   83,262 ± 2,264  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  TestTestTest                          avgt   50   58,975 ± 1,559  ns/op
>> LatchBenchmark.binaryHeapOutputStreamInDirect  Test                                  avgt   50   48,506 ± 1,116  ns/op
>>
>> https://github.com/javaller/MyBenchmark/blob/master/out_06_03_17_2.txt
>>
>> What's the next step?
>>
>> Do I have to add the benchmark to the Ignite project?
>>
>> Vadim Opolskiy
>>
>> 2017-03-03 21:11 GMT+03:00 Valentin Kulichenko <
>> valentin.kulichenko@gmail.com>:
>>
>>> Hi Vadim,
>>>
>>> What do you mean by "copied benchmarks"? What changed since the previous
>>> iteration, and why are the results so different?
>>>
>>> As for the duplicated loop, you don't need it. BinaryOutputStream allows you to
>>> write a value to a particular position (even before already written data).
>>> So you can reserve 4 bytes for the length, remember the position, calculate the
>>> length while encoding and writing bytes, and then write the length.
>>>
>>> -Val
>>>
>>> On Fri, Mar 3, 2017 at 12:45 AM, Вадим Опольский <va...@gmail.com>
>>> wrote:
>>>
>>>> Valentin,
>>>>
>>>> What do you think about the duplicated loop in strToBinaryOutputStream?
>>>>
>>>> How can StrLen be calculated for outBinaryHeap without this loop?
>>>>
>>>> public class BinaryUtilsNew extends BinaryUtils {
>>>>
>>>>     public static int getStrLen(String val) {
>>>>         int strLen = val.length();
>>>>         int utfLen = 0;
>>>>         int c;
>>>>
>>>>         // Determine length of resulting byte array.
>>>>         for (int cnt = 0; cnt < strLen; cnt++) {
>>>>             c = val.charAt(cnt);
>>>>
>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>                 utfLen++;
>>>>             else if (c > 0x07FF)
>>>>                 utfLen += 3;
>>>>             else
>>>>                 utfLen += 2;
>>>>         }
>>>>
>>>>         return utfLen;
>>>>     }
>>>>
>>>>     public static void strToUtf8BytesDirect(BinaryOutputStream outBinaryHeap, String val) {
>>>>         int strLen = val.length();
>>>>         int c, cnt;
>>>>
>>>>         int position = 0;
>>>>
>>>>         outBinaryHeap.unsafeEnsure(1 + 4);
>>>>
>>>>         // Header: type marker, then the encoded length (first pass over the string).
>>>>         outBinaryHeap.unsafeWriteByte(GridBinaryMarshaller.STRING);
>>>>         outBinaryHeap.unsafeWriteInt(getStrLen(val));
>>>>
>>>>         // Second pass: encode each char and write it straight to the stream.
>>>>         for (cnt = 0; cnt < strLen; cnt++) {
>>>>             c = val.charAt(cnt);
>>>>
>>>>             if (c >= 0x0001 && c <= 0x007F)
>>>>                 outBinaryHeap.writeByte((byte) c);
>>>>             else if (c > 0x07FF) {
>>>>                 outBinaryHeap.writeByte((byte)(0xE0 | (c >> 12) & 0x0F));
>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c >> 6) & 0x3F));
>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>             }
>>>>             else {
>>>>                 outBinaryHeap.writeByte((byte)(0xC0 | ((c >> 6) & 0x1F)));
>>>>                 outBinaryHeap.writeByte((byte)(0x80 | (c & 0x3F)));
>>>>             }
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>> Vadim
>>>>
>>>>
>>>>
>>>> 2017-03-03 2:00 GMT+03:00 Valentin Kulichenko <
>>>> valentin.kulichenko@gmail.com>:
>>>>
>>>>> Vadim,
>>>>>
>>>>> Looks better now. Can you also try to modify the benchmark so that
>>>>> marshaller and writer are created outside of the measured method? I.e. the
>>>>> benchmark methods should be as simple as this:
>>>>>
>>>>>     @Benchmark
>>>>>     public void binaryHeapOutputStreamDirect() throws Exception {
>>>>>         writer.doWriteStringDirect(message);
>>>>>     }
>>>>>
>>>>>     @Benchmark
>>>>>     public void binaryHeapOutputStreamInDirect() throws Exception {
>>>>>         writer.doWriteString(message);
>>>>>     }
>>>>>
>>>>> In any case, do I understand correctly that it didn't actually make
>>>>> any performance difference? If so, I think we can close the ticket.
>>>>>
>>>>> Vova, can you also take a look and provide your thoughts?
>>>>>
>>>>> -Val
>>>>>
>>>>> On Thu, Mar 2, 2017 at 1:27 PM, Вадим Опольский <va...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Valentin!
>>>>>>
>>>>>> I've created:
>>>>>>
>>>>>> new method strToUtf8BytesDirect in BinaryUtilsNew
>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>
>>>>>> new method doWriteStringDirect in BinaryWriterExImplNew
>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>
>>>>>> benchmarks for BinaryWriterExImpl doWriteString and
>>>>>> BinaryWriterExImplNew doWriteStringDirect
>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>
>>>>>> Here are the results of the comparison:
>>>>>>
>>>>>> Benchmark                                   Mode  Cnt        Score        Error  Units
>>>>>> ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
>>>>>> ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
>>>>>>
>>>>>> Vadim
>>>>>>
>>>>>> 2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <
>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>
>>>>>>> Hi Vadim,
>>>>>>>
>>>>>>> We're getting closer :) I would actually like to see a test for the
>>>>>>> actual implementation of the BinaryWriterExImpl#doWriteString method.
>>>>>>> The logic in binaryHeapOutputInDirect() confuses me a bit and I'm not sure
>>>>>>> the comparison is valid.
>>>>>>>
>>>>>>> Can you please do the following:
>>>>>>>
>>>>>>> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste
>>>>>>> the code from existing BinaryUtils#strToUtf8Bytes and modify it so that it
>>>>>>> takes BinaryOutputStream as an argument and writes to it directly. Do not
>>>>>>> create stream inside this method, as it's the same as creating new array.
>>>>>>> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste
>>>>>>> the code from existing BinaryWriterExImpl#doWriteString and modify
>>>>>>> it so that it uses BinaryUtils#strToUtf8BytesDirect and doesn't
>>>>>>> call out.writeByteArray.
>>>>>>> 3. Create benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>>> I.e., create an instance of BinaryWriterExImpl and call doWriteString() in
>>>>>>> benchmark method.
>>>>>>> 4. Similarly, create benchmark for BinaryWriterExImpl#doWriteStringDirect.
>>>>>>> 5. Compare results.
>>>>>>>
>>>>>>> This will give us clear picture of how these two approaches perform.
>>>>>>> Your current results are actually promising, but I would like to confirm
>>>>>>> them.
>>>>>>>
>>>>>>> -Val
>>>>>>>
>>>>>>> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <
>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Valentin!
>>>>>>>>
>>>>>>>> Thank you for comments.
>>>>>>>>
>>>>>>>> There is a new method which writes directly to BinaryOutputStream
>>>>>>>> instead of intermediate array.
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>>>>>>>
>>>>>>>> There is benchmark.
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>>>>>>>
>>>>>>>> Unit test
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>>>>>>>
>>>>>>>> Statistics
>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>>>>>>>
>>>>>>>> Benchmark                                 Mode  Cnt    Score   Error  Units
>>>>>>>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>>>>>>>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>>>>>>>
>>>>>>>>
>>>>>>>> Vadim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>
>>>>>>>>> Hi Vadim,
>>>>>>>>>
>>>>>>>>> Looks like you accidentally removed dev list from the thread,
>>>>>>>>> adding it back.
>>>>>>>>>
>>>>>>>>> I think there is still misunderstanding. What I propose is to
>>>>>>>>> modify the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>>>>>>>> instead of intermediate array. This should decrease memory consumption and
>>>>>>>>> can also increase performance as we will avoid 'writeByteArray'
>>>>>>>>> step at the end.
>>>>>>>>>
>>>>>>>>> Does it make sense to you?
>>>>>>>>>
>>>>>>>>> -Val
>>>>>>>>>
>>>>>>>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <
>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, Valentin!
>>>>>>>>>>
>>>>>>>>>> What do you think about using the methods of BinaryOutputStream:
>>>>>>>>>>
>>>>>>>>>> 1) writeByteArray(byte[] val)
>>>>>>>>>> 2) writeCharArray(char[] val)
>>>>>>>>>> 3) write (byte[] arr, int off, int len)
>>>>>>>>>>
>>>>>>>>>> String val = "Test";
>>>>>>>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>>
>>>>>>>>>> String val = "Test";
>>>>>>>>>> out.writeCharArray(val.toCharArray());
>>>>>>>>>>
>>>>>>>>>> String val = "Test";
>>>>>>>>>> InputStream stream = new ByteArrayInputStream(
>>>>>>>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>>>>>>>> byte[] buffer = new byte[1024];
>>>>>>>>>> int len;
>>>>>>>>>> // Copy only the bytes actually read on each pass.
>>>>>>>>>> while ((len = stream.read(buffer)) != -1) {
>>>>>>>>>>     out.write(buffer, 0, len);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> What else can we use ?
>>>>>>>>>>
>>>>>>>>>> Vadim
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>
>>>>>>>>>>> Which method implements the approach described in the ticket?
>>>>>>>>>>> From what I see, all writeToStringX versions are still encoding into an
>>>>>>>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>>>>>>>> is the approach where bytes are written directly into the stream during
>>>>>>>>>>> encoding. Encoding algorithm itself should stay the same for now, otherwise
>>>>>>>>>>> we will not know how to interpret the result.
>>>>>>>>>>>
>>>>>>>>>>> It looks like there is some misunderstanding here, so please let
>>>>>>>>>>> me know if anything is still unclear. I will be happy to answer your questions.
>>>>>>>>>>>
>>>>>>>>>>> -Val
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, I will review this week.
>>>>>>>>>>>>
>>>>>>>>>>>> -Val
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Valentin!
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>
>>>>>>>>>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>>>>>>>>>> added new methods with the changes described in the ticket.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>>>>>>>
>>>>>>>>>>>>> I created a benchmark for BinaryWriterExImplNew.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran the benchmark and compared the results.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>>>>>>>
>>>>>>>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>>>>>>>> Benchmark                                    Mode  Cnt        Score        Error  Units
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it OK? What's the next step? Do I have to move this
>>>>>>>>>>>>> JMH benchmark to the Ignite project?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Vadim,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure I understand your benchmarks and how they verify
>>>>>>>>>>>>>> the optimization discussed here. Basically, here is what needs to be done:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString
>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>>>>>>>> 5. Compare results.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>>>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello everybody!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It collects timing data for serialization.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To run it, do the following:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2) install it - mvn install
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Vadim Opolski
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we misunderstood each other. My understanding of
>>>>>>>>>>>>>>>> this optimization is the following.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>>>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What this ticket suggests is to write directly into stream
>>>>>>>>>>>>>>>> while string is encoded, without intermediate array. This both reduces
>>>>>>>>>>>>>>>> memory consumption and eliminates array copy step.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Vadim, can you create a micro benchmark and check if it
>>>>>>>>>>>>>>>> gives any improvement?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It is hard to say whether it makes sense or not. No doubt,
>>>>>>>>>>>>>>>>> it could speed up marshalling process at the cost of 2x memory required for
>>>>>>>>>>>>>>>>> strings. From my previous experience with marshalling micro-optimizations,
>>>>>>>>>>>>>>>>> we will hardly ever notice speedup in distributed environment.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But there is another side: it could speed up our queries,
>>>>>>>>>>>>>>>>> because we will not have to unmarshal string on every field access. So I
>>>>>>>>>>>>>>>>> would try to make this optimization optional and then measure query
>>>>>>>>>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>>>>>>>>>> interesting results.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can you please take a look and provide your thoughts? Can
>>>>>>>>>>>>>>>>>> this be applied to binary marshaller? From what I recall, it serializes
>>>>>>>>>>>>>>>>>> string a bit differently from optimized marshaller, so I'm not sure.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Val
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>>>>>>>>>> OptimizedMarshaller.
>>>>>>>>>>>>>>>>>>> > However, I would check if this optimization is
>>>>>>>>>>>>>>>>>>> applicable to
>>>>>>>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>>>>>>>>>> vaopolskij@gmail.com>
>>>>>>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
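
For reference, the two approaches compared in this thread (encode into an intermediate array and then copy it, versus write each encoded byte straight into the stream) can be sketched roughly as below. This is only an illustration under stated assumptions: java.io.ByteArrayOutputStream stands in for Ignite's BinaryOutputStream, the class and method names (DirectUtf8Sketch, writeViaArray, writeDirect, encodedLength) are hypothetical, and the type byte and length header written by the quoted strToUtf8BytesDirect are left out. The branch conditions follow the encoding loop quoted above.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; ByteArrayOutputStream is a stand-in for BinaryOutputStream.
final class DirectUtf8Sketch {
    /** Counts encoded bytes using the same three-way branch as the quoted length loop. */
    static int encodedLength(String val) {
        int len = 0;
        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);
            if (c >= 0x0001 && c <= 0x007F)
                len++;
            else if (c > 0x07FF)
                len += 3;
            else
                len += 2;
        }
        return len;
    }

    /** Two-step approach: fill a temporary array, then copy it into the stream. */
    static void writeViaArray(ByteArrayOutputStream out, String val) {
        byte[] arr = new byte[encodedLength(val)];
        int pos = 0;
        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);
            if (c >= 0x0001 && c <= 0x007F)
                arr[pos++] = (byte) c;
            else if (c > 0x07FF) {
                arr[pos++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
                arr[pos++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                arr[pos++] = (byte) (0x80 | (c & 0x3F));
            }
            else {
                arr[pos++] = (byte) (0xC0 | ((c >> 6) & 0x1F));
                arr[pos++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        out.write(arr, 0, arr.length); // The extra copy the ticket wants to avoid.
    }

    /** Direct approach: no temporary array, bytes go to the stream as they are produced. */
    static void writeDirect(ByteArrayOutputStream out, String val) {
        for (int i = 0; i < val.length(); i++) {
            int c = val.charAt(i);
            if (c >= 0x0001 && c <= 0x007F)
                out.write(c);
            else if (c > 0x07FF) {
                out.write(0xE0 | ((c >> 12) & 0x0F));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
            else {
                out.write(0xC0 | ((c >> 6) & 0x1F));
                out.write(0x80 | (c & 0x3F));
            }
        }
    }
}
```

Both methods should produce identical bytes for the same input, so any benchmark difference would come from the temporary allocation and the copy rather than from the encoding itself.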