Posted to user@ignite.apache.org by Dren Butković <dr...@gmail.com> on 2023/03/14 10:07:31 UTC

pyignite - performance issue

Hi,

I compared the speed of retrieving data from Apache Ignite using
several methods. All records are in one table, and I did not use any WHERE
condition, only SELECT * FROM TABLE_XYZ LIMIT 20000.

Test results are:
Apache Ignite

   - Apache Ignite REST API - 0.52 seconds
   - JDBC - 4 seconds
   - Python pyignite - 40 seconds !!!

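pseudocode in Python using pyignite:

from pyignite import Client

client = Client(username="ignite", password="pass", use_ssl=False)
client.connect('localhost', 10800)

# Iterate over every row without processing it, so only the fetch is measured
cursor = client.sql('SELECT * FROM TABLE_XYZ LIMIT 20000')
for row in cursor:
    pass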

After that I compared the speed of retrieving data from PostgreSQL
using JDBC and the psycopg2 Python package. The SQL query is the same:
SELECT * FROM TABLE_XYZ LIMIT 20000.

PostgreSQL

   - JDBC - 3 seconds
   - Python psycopg2 using fetchall - 3 seconds
   - Python psycopg2 using fetchone - 4 seconds

pseudocode in Python using psycopg2:

import psycopg2

conn = psycopg2.connect(database=DB_NAME,
            user=DB_USER,
            password=DB_PASS,
            host=DB_HOST,
            port=DB_PORT)

cur = conn.cursor()
cur.execute("SELECT * FROM TABLE_XYZ LIMIT 20000")
rows = cur.fetchall()
for data in rows:
    pass
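
The post does not show how the timings were taken. A minimal sketch of one way to time both loops identically, assuming wall-clock time around draining the result set is what is being compared; `time_fetch` and its parameters are hypothetical names introduced for this illustration, and the demo line uses generated numbers rather than a real database:

```python
import time
from typing import Callable, Iterable, Tuple


def time_fetch(make_rows: Callable[[], Iterable], repeats: int = 3) -> Tuple[int, float]:
    """Drain the iterable returned by make_rows() `repeats` times.

    Returns (row_count_of_last_run, best_elapsed_seconds).
    """
    best = float("inf")
    count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        # Count rows without keeping them, mirroring the `for ...: pass` loops
        count = sum(1 for _ in make_rows())
        best = min(best, time.perf_counter() - start)
    return count, best


# Stand-in data source; replace the lambda with a function that runs the query
rows, seconds = time_fetch(lambda: iter(range(20000)))
print(f"{rows} rows in {seconds:.4f} s")
```

Passing a function that opens the cursor (rather than an already opened cursor) keeps query execution inside the timed region, so every client is measured over the same work.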

I can conclude that the pyignite implementation has much worse performance
compared to psycopg2 tests. The performance difference on PostgreSQL
between Java JDBC and Python psycopg2 is negligible.

The performance difference on Apache Ignite between Java JDBC and Python
pyignite is very big.

Could someone please comment on the tests? Did I do something wrong, or are
these results expected? How can such large differences in execution times
be explained? Do you have any suggestions for getting better results with
pyignite?

Thank you

Re: pyignite - performance issue

Posted by Ivan Daschinsky <iv...@gmail.com>.
It seems that epoll-shim can help to mitigate this issue without much code.

Tue, 14 Mar 2023 at 21:11, Ivan Daschinsky <iv...@gmail.com> wrote:

> Yep, it has been broken since the introduction of epoll in the network
> code. So kqueue support needs to be implemented.


-- 
Sincerely yours, Ivan Daschinskiy

Re: pyignite - performance issue

Posted by Ivan Daschinsky <iv...@gmail.com>.
Yep, it has been broken since the introduction of epoll in the network
code. So kqueue support needs to be implemented.
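
For readers unfamiliar with the distinction: epoll is a Linux-specific readiness API, while macOS and the BSDs provide kqueue instead. As a rough illustration only (this uses Python's standard library, not the Ignite C++ network code), the sketch below lists which mechanisms the current platform exposes. `available_pollers` is a name made up for this example.

```python
import select
import selectors


def available_pollers():
    """Return the names of the I/O readiness APIs this platform's `select` module exposes."""
    candidates = ("epoll", "kqueue", "poll", "select")
    return [name for name in candidates if hasattr(select, name)]


# On Linux the list typically includes "epoll"; on macOS it includes "kqueue" instead.
print(available_pollers())

# The standard library picks the most efficient available mechanism by itself.
print(type(selectors.DefaultSelector()).__name__)
```

Code written directly against epoll has no such fallback, which is why it needs either a kqueue backend or a compatibility layer to build on macOS.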

Tue, 14 Mar 2023 at 19:22, Stephen Darlington <
stephen.darlington@gridgain.com> wrote:

> Macs don’t have epoll, so it doesn’t compile currently.

-- 
Sincerely yours, Ivan Daschinskiy

Re: pyignite - performance issue

Posted by Stephen Darlington <st...@gridgain.com>.
Macs don’t have epoll, so it doesn’t compile currently.

> On 14 Mar 2023, at 16:02, Igor Sapego <is...@apache.org> wrote:
> 
> Unfortunately, we do not have Mac agents, so we can not detect when compilation on Mac OS is broken, so yeah...
> 
> Best Regards,
> Igor


Re: pyignite - performance issue

Posted by Igor Sapego <is...@apache.org>.
Unfortunately, we do not have Mac agents, so we can not detect when
compilation on Mac OS is broken, so yeah...

Best Regards,
Igor


On Tue, Mar 14, 2023 at 2:48 PM Ivan Daschinsky <iv...@gmail.com> wrote:

> The Ignite ODBC driver works well on Linux and Windows, but it seems
> that it cannot be compiled on macOS.

Re: pyignite - performance issue

Posted by Ivan Daschinsky <iv...@gmail.com>.
The Ignite ODBC driver works well on Linux and Windows, but it seems
that it cannot be compiled on macOS.

Tue, 14 Mar 2023 at 14:47, Ivan Daschinsky <iv...@gmail.com> wrote:

> Hi, Dren!
>
> Unfortunately, pyignite doesn't have an efficient native serialization
> library, whereas psycopg2 does (it is a thin wrapper around libpq).
>
> I would suggest two options:
> 1. Reduce the default batch size, like this: `client.sql("SELECT * FROM
> TABLE", page_size=10)`. The default of 1024 seems too big, and parsing such
> a big response seems to be really slow.
> 2. Use the Ignite ODBC driver with pyodbc on top of it. Both of them work
> pretty well.


-- 
Sincerely yours, Ivan Daschinskiy

Re: pyignite - performance issue

Posted by Dren Butković <dr...@gmail.com>.
Hi Ivan,
Thank you for your reply.
I made a mistake on the first test when I used the REST API.
The host from which I ran the tests and the network bandwidth were not the
same.
I repeated the test and the results I got using Ignite REST API and JDBC
are almost the same.

REST API - 1.7 seconds
JDBC - 2 seconds

Regarding ODBC and pyodbc, I ran a new test, and SELECT * FROM TABLE LIMIT
20000 also takes around two seconds.
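
The test code for the pyodbc run is not shown in this thread. A minimal sketch of what such a run could look like, under stated assumptions: the driver is registered under the name "Apache Ignite", the node listens on localhost:10800, and the schema is PUBLIC. The function names `ignite_odbc_conn_str` and `count_rows_odbc` are hypothetical, and pyodbc is imported lazily so the connection-string helper can be used without it.

```python
def ignite_odbc_conn_str(host="localhost", port=10800, schema="PUBLIC"):
    """Build an ODBC connection string (driver name and schema are assumptions)."""
    return f"DRIVER={{Apache Ignite}};ADDRESS={host}:{port};SCHEMA={schema}"


def count_rows_odbc(query, conn_str=None, batch=1000):
    """Run `query` through pyodbc and return how many rows were fetched."""
    import pyodbc  # needs pyodbc plus an installed Ignite ODBC driver

    conn = pyodbc.connect(conn_str or ignite_odbc_conn_str(), autocommit=True)
    try:
        cur = conn.cursor()
        cur.execute(query)
        total = 0
        while True:
            rows = cur.fetchmany(batch)  # fetch in batches instead of all at once
            if not rows:
                break
            total += len(rows)
        return total
    finally:
        conn.close()


print(ignite_odbc_conn_str())
# Requires a running Ignite node and the ODBC driver:
# count_rows_odbc("SELECT * FROM TABLE_XYZ LIMIT 20000")
```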

Thank you very much for your reply, it helped me a lot.

Best regards
Dren

On Tue, Mar 14, 2023 at 12:47 PM Ivan Daschinsky <iv...@gmail.com>
wrote:

> Hi, Dren!
>
> Unfortunately, pyignite doesn't have an efficient native serialization
> library, whereas psycopg2 does (it is a thin wrapper around libpq).
>
> I would suggest two options:
> 1. Reduce the default batch size, like this: `client.sql("SELECT * FROM
> TABLE", page_size=10)`. The default of 1024 seems too big, and parsing such
> a big response seems to be really slow.
> 2. Use the Ignite ODBC driver with pyodbc on top of it. Both of them work
> pretty well.

Re: pyignite - performance issue

Posted by Ivan Daschinsky <iv...@gmail.com>.
Hi, Dren!

Unfortunately, pyignite doesn't have an efficient native serialization
library, whereas psycopg2 does (it is a thin wrapper around libpq).

I would suggest two options:
1. Reduce the default batch size, like this: `client.sql("SELECT * FROM
TABLE", page_size=10)`. The default of 1024 seems too big, and parsing such a
big response seems to be really slow.
2. Use the Ignite ODBC driver with pyodbc on top of it. Both of them work
pretty well.
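
Option 1 can be checked empirically. A minimal sketch, assuming the connection details and table name from the original post; `sweep_page_sizes` and `pyignite_rows` are hypothetical helper names, and the page sizes are arbitrary sample values. pyignite is imported lazily, so the timing helper itself runs without it (the demo line uses generated numbers).

```python
import time


def sweep_page_sizes(run_query, page_sizes=(10, 100, 1024)):
    """Time draining run_query(page_size) for each size; returns {page_size: seconds}."""
    timings = {}
    for size in page_sizes:
        start = time.perf_counter()
        for _ in run_query(size):  # iterate every row without processing it
            pass
        timings[size] = time.perf_counter() - start
    return timings


def pyignite_rows(page_size, host="localhost", port=10800):
    """Yield rows from Ignite using the given page_size (credentials as in the post)."""
    from pyignite import Client  # needs pyignite and a running Ignite node

    client = Client(username="ignite", password="pass", use_ssl=False)
    with client.connect(host, port):
        query = "SELECT * FROM TABLE_XYZ LIMIT 20000"
        with client.sql(query, page_size=page_size) as cursor:
            yield from cursor


# Stand-in data source so the sketch runs anywhere:
print(sweep_page_sizes(lambda size: iter(range(20000))))
# Against a real cluster:
# print(sweep_page_sizes(pyignite_rows))
```

Which page size is fastest will depend on row width and network latency, so the sweep is worth repeating on the actual table.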

Tue, 14 Mar 2023 at 14:10, Dren Butković <dr...@gmail.com> wrote:

>
> Ignite and py client versions:
>
> - Apache Ignite 2.13.0
> - pyignite 0.5.2

-- 
Sincerely yours, Ivan Daschinskiy

Re: pyignite - performance issue

Posted by Dren Butković <dr...@gmail.com>.
Ignite and py client versions:

- Apache Ignite 2.13.0
- pyignite 0.5.2

On Tue, Mar 14, 2023 at 11:46 AM Zhenya Stanilovsky via user <
user@ignite.apache.org> wrote:

> Hi, please add the Ignite and Python client versions.

Re: pyignite - performance issue

Posted by Zhenya Stanilovsky via user <us...@ignite.apache.org>.
Hi, please add the Ignite and Python client versions.
 