Posted to notifications@ignite.apache.org by GitBox <gi...@apache.org> on 2019/06/05 10:04:21 UTC

[GitHub] [ignite] kulinskyvs commented on issue #6553: IGNITE-11854 Fix performance problem when creating arrays

kulinskyvs commented on issue #6553: IGNITE-11854 Fix performance problem when creating arrays
URL: https://github.com/apache/ignite/pull/6553#issuecomment-499021545
 
 
   I've tried the patch, and it looks like something is still wrong.
   Below is the simple test case I used to test it (motivated by https://issues.apache.org/jira/browse/IGNITE-11854):
   
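   ```
   import time

   from pyignite.datatypes import ByteArrayObject

   # my_cache is an already opened pyignite cache.
   # range(10) yields the ten sizes (3 MB to 30 MB) listed in the results below.
   for i in range(10):
       arr_len = 3_000_000 * (i + 1)
       content = bytearray(arr_len)

       # Fill the payload with a repeating 0..255 pattern.
       for j in range(arr_len):
           content[j] = j % 256

       start_time = time.time()
       my_cache.put("key_bin", content, value_hint=ByteArrayObject)
       put_elapsed = time.time() - start_time

       start_time = time.time()
       my_cache.get("key_bin")
       get_elapsed = time.time() - start_time
       print("size: {}, put time:{} secs, get time: {} secs".format(len(content), put_elapsed, get_elapsed))
   ```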
   And below are the results I got:
   
   > size: 3000000, put time:1.118593692779541 secs, get time: 0.5311076641082764 secs
   size: 6000000, put time:2.2962043285369873 secs, get time: 1.0077407360076904 secs
   size: 9000000, put time:3.3157756328582764 secs, get time: 1.4834346771240234 secs
   size: 12000000, put time:4.277689695358276 secs, get time: 2.100264549255371 secs
   size: 15000000, put time:5.646029949188232 secs, get time: 2.890953540802002 secs
   size: 18000000, put time:6.578389883041382 secs, get time: 2.971928596496582 secs
   size: 21000000, put time:8.114448547363281 secs, get time: 3.5916993618011475 secs
   size: 24000000, put time:8.961003541946411 secs, get time: 4.091071605682373 secs
   size: 27000000, put time:10.465842247009277 secs, get time: 5.260945081710815 secs
   size: 30000000, put time:10.91893482208252 secs, get time: 5.647061586380005 secs
   
   So it takes more than 10 seconds to put 30 MB into Ignite and more than 5 seconds to read the same data back.
   It looks like most of the time is spent just iterating through the given byte array in PrimitiveArray#from_python (see below):
   
   ```
   buffer = bytearray(header)

   # One from_python call and one append per element of the array.
   for x in value:
       buffer += cls.primitive_type.from_python(x)
   return bytes(buffer)
   ```
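
To illustrate the cost of that loop, here is a minimal standalone sketch, not pyignite code. `HEADER`, `per_element` and `bulk` are hypothetical names, and `struct.pack` only stands in for `cls.primitive_type.from_python`. It compares per-element serialization with a single bulk copy, which should be possible when every element is exactly one byte:

```python
import struct

# Hypothetical placeholder; the real array header is built by the client.
HEADER = b"\x0c"


def per_element(value):
    """Mimic the quoted loop: one conversion and one append per byte."""
    buffer = bytearray(HEADER)
    for x in value:
        buffer += struct.pack("<B", x)  # stand-in for primitive_type.from_python
    return bytes(buffer)


def bulk(value):
    """Copy the whole payload in one operation instead of byte by byte."""
    return HEADER + bytes(value)


payload = bytearray(i % 256 for i in range(100_000))
# Both approaches produce identical output; only the amount of work differs.
assert per_element(payload) == bulk(payload)
```

The bulk variant relies on the payload already being a bytes-like object, so it would only apply to byte arrays and not to wider primitive types.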
   
   Is this the expected behavior? If so, what is the recommended way to put large binary data (more than 100 MB) into Ignite using the Python thin client?
   
   Also, I've noticed that `cache.get()` returns a list of ints instead of a bytearray, and some of the values are negative. Is this a defect?
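
One possible explanation for the negative values, which is an assumption and not confirmed against the pyignite source: Java's `byte` is signed, so if the client decodes each element as a signed 8-bit integer, bytes 128..255 come back as -128..-1. The sketch below uses hypothetical helper names to show that mapping and a way to recover the raw bytes:

```python
def bytes_to_signed(data):
    """Show how raw bytes look when read as signed 8-bit integers."""
    return [b - 256 if b > 127 else b for b in data]


def signed_to_bytes(values):
    """Map signed 8-bit integers (-128..127) back to raw bytes (0..255)."""
    return bytes(v & 0xFF for v in values)


original = bytes([0, 1, 127, 128, 255])
as_signed = bytes_to_signed(original)  # values above 127 become negative
# The round trip restores the original payload.
assert signed_to_bytes(as_signed) == original
```

If that assumption holds, applying something like `signed_to_bytes` to the list returned by `cache.get()` would restore the original content, at the price of another pass over the data.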
   
   
