You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by arthi <Ar...@nielsen.com> on 2016/04/04 16:23:44 UTC

Re: Host Arrays in C++ API

Hi Igor,

Is this something we can expect as a feature in the near future? Please let
us know if this is tracked as an enhancement / jira issue.

Thanks for your help,
Arthi



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p3902.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Host Arrays in C++ API

Posted by Sergi Vladykin <se...@gmail.com>.
Hi,

In CacheConfiguration you can set setLongQueryWarningTimeout property to 0
and see in logs how long does it take to execute each query part.

Most probably this slowdown is because of restriction in our SQL engine,
which has to generate the whole result set before starting to send it back
to the client. Usually it is not a problem, but in your case it is a ~1M
rows which is a fairly big number. You can try to address this limitation
by starting more nodes and making the query execution process more parallel
(instead of 4 nodes you can try to start 20, may be multiple nodes on each
box).

Sergi




2016-04-13 17:48 GMT+03:00 Igor Sapego <is...@gridgain.com>:

> Sergi,
>
> Can you please take a look?
>
> Here is the query:
> SELECT sid_per_id, sid_mah_id, sid_itm_id, sid_prm_id, sid_cha_code,
> sid_service, sid_itm_dist
> FROM SHOP_ITEM_BITMAP_CACHE.ShopItemBitmap as bitmap
>     JOIN PROD_HIER_CACHE.ProdHier as prod ON prod.GHD_LEV_6 =
> bitmap.SID_ITM_ID AND prod.GHD_HIE_ID = 0 AND prod.ghd_lev_2_desc = 'WINE'
>     JOIN MRKT_POD_ACV_CACHE.MarketPodAcv as mkt ON mkt.CHLD_MRKT_KEY =
> bitmap.SID_MAH_ID and mkt.MRKT_KEY = 4378038 AND mkt.prd_id = 712
> WHERE bitmap.SID_PER_ID = 712;
>
> Here is the explain output:
> [[SELECT DISTINCT
>     SID_PER_ID AS __C0,
>     SID_MAH_ID AS __C1,
>     SID_ITM_ID AS __C2,
>     SID_PRM_ID AS __C3,
>     SID_CHA_CODE AS __C4,
>     SID_SERVICE AS __C5,
>     SID_ITM_DIST AS __C6
> FROM PROD_HIER_CACHE.PRODHIER PROD
>     /* PROD_HIER_CACHE."ghd_hie_id_asc_ghd_lev_2_desc_asc_idx": GHD_HIE_ID
> = 0
>         AND GHD_LEV_2_DESC = 'WINE'
>      */
>     /* WHERE (PROD.GHD_HIE_ID = 0)
>         AND (PROD.GHD_LEV_2_DESC = 'WINE')
>     */
> INNER JOIN SHOP_ITEM_BITMAP_CACHE.SHOPITEMBITMAP
>     /* SHOP_ITEM_BITMAP_CACHE."sid_itm_id_asc_idx": SID_ITM_ID =
> PROD.GHD_LEV_6 */
>     ON 1=1
>     /* WHERE (SHOPITEMBITMAP.SID_PER_ID = 712)
>         AND (PROD.GHD_LEV_6 = SHOPITEMBITMAP.SID_ITM_ID)
>     */
> INNER JOIN MARKET_POD_ACV_CACHE.MARKETPODACV MKT
>     /*
> MARKET_POD_ACV_CACHE."mrkt_key_asc_prd_id_asc_chld_mrkt_key_asc_idx":
> PRD_ID = 712
>         AND MRKT_KEY = 4378038
>         AND CHLD_MRKT_KEY = SHOPITEMBITMAP.SID_MAH_ID
>      */
>     ON 1=1
> WHERE (SHOPITEMBITMAP.SID_PER_ID = 712)
>     AND (((MKT.PRD_ID = 712)
>     AND ((MKT.MRKT_KEY = 4378038)
>     AND (MKT.CHLD_MRKT_KEY = SHOPITEMBITMAP.SID_MAH_ID)))
>     AND ((PROD.GHD_LEV_6 = SHOPITEMBITMAP.SID_ITM_ID)
>     AND ((PROD.GHD_HIE_ID = 0)
>     AND (PROD.GHD_LEV_2_DESC = 'WINE'))))], [SELECT DISTINCT
>     __C0 AS SID_PER_ID,
>     __C1 AS SID_MAH_ID,
>     __C2 AS SID_ITM_ID,
>     __C3 AS SID_PRM_ID,
>     __C4 AS SID_CHA_CODE,
>     __C5 AS SID_SERVICE,
>     __C6 AS SID_ITM_DIST
> FROM PUBLIC.__T0
>     /* SHOP_ITEM_BITMAP_CACHE."merge_scan" */]]
>
> The first thing I could see is that there is no index for "sid_per_id" and
> the second thing is that "merge_scan" is being used.
> What is your thoughts?
>
>
> Best Regards,
> Igor
>
> On Tue, Apr 12, 2016 at 2:11 PM, Igor Sapego <is...@gridgain.com> wrote:
>
>> Arthi,
>>
>> How much time does it take? What is the speed of fetching in rows
>> per second?
>>
>> Guys, can someone who is more familiar with Ignite's SQL queries
>> take a look?
>>
>> Best Regards,
>> Igor
>>
>> On Mon, Apr 11, 2016 at 5:17 PM, arthi <
>> Arthi.Kasturirangan.ap@nielsen.com> wrote:
>>
>>> Hi Igor,
>>>
>>> the SQL query for getNext() takes a whole lot of time when the results
>>> returned is more than 900,000 both using Java and C++ API. I have tried a
>>> number of configurations with indexes, but just dont get to speed these
>>> up.
>>>
>>> thanks,
>>> Arthi
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p4064.html
>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>>
>>
>>
>

Re: Host Arrays in C++ API

Posted by Igor Sapego <is...@gridgain.com>.
Sergi,

Can you please take a look?

Here is the query:
SELECT sid_per_id, sid_mah_id, sid_itm_id, sid_prm_id, sid_cha_code,
sid_service, sid_itm_dist
FROM SHOP_ITEM_BITMAP_CACHE.ShopItemBitmap as bitmap
    JOIN PROD_HIER_CACHE.ProdHier as prod ON prod.GHD_LEV_6 =
bitmap.SID_ITM_ID AND prod.GHD_HIE_ID = 0 AND prod.ghd_lev_2_desc = 'WINE'
    JOIN MRKT_POD_ACV_CACHE.MarketPodAcv as mkt ON mkt.CHLD_MRKT_KEY =
bitmap.SID_MAH_ID and mkt.MRKT_KEY = 4378038 AND mkt.prd_id = 712
WHERE bitmap.SID_PER_ID = 712;

Here is the explain output:
[[SELECT DISTINCT
    SID_PER_ID AS __C0,
    SID_MAH_ID AS __C1,
    SID_ITM_ID AS __C2,
    SID_PRM_ID AS __C3,
    SID_CHA_CODE AS __C4,
    SID_SERVICE AS __C5,
    SID_ITM_DIST AS __C6
FROM PROD_HIER_CACHE.PRODHIER PROD
    /* PROD_HIER_CACHE."ghd_hie_id_asc_ghd_lev_2_desc_asc_idx": GHD_HIE_ID
= 0
        AND GHD_LEV_2_DESC = 'WINE'
     */
    /* WHERE (PROD.GHD_HIE_ID = 0)
        AND (PROD.GHD_LEV_2_DESC = 'WINE')
    */
INNER JOIN SHOP_ITEM_BITMAP_CACHE.SHOPITEMBITMAP
    /* SHOP_ITEM_BITMAP_CACHE."sid_itm_id_asc_idx": SID_ITM_ID =
PROD.GHD_LEV_6 */
    ON 1=1
    /* WHERE (SHOPITEMBITMAP.SID_PER_ID = 712)
        AND (PROD.GHD_LEV_6 = SHOPITEMBITMAP.SID_ITM_ID)
    */
INNER JOIN MARKET_POD_ACV_CACHE.MARKETPODACV MKT
    /*
MARKET_POD_ACV_CACHE."mrkt_key_asc_prd_id_asc_chld_mrkt_key_asc_idx":
PRD_ID = 712
        AND MRKT_KEY = 4378038
        AND CHLD_MRKT_KEY = SHOPITEMBITMAP.SID_MAH_ID
     */
    ON 1=1
WHERE (SHOPITEMBITMAP.SID_PER_ID = 712)
    AND (((MKT.PRD_ID = 712)
    AND ((MKT.MRKT_KEY = 4378038)
    AND (MKT.CHLD_MRKT_KEY = SHOPITEMBITMAP.SID_MAH_ID)))
    AND ((PROD.GHD_LEV_6 = SHOPITEMBITMAP.SID_ITM_ID)
    AND ((PROD.GHD_HIE_ID = 0)
    AND (PROD.GHD_LEV_2_DESC = 'WINE'))))], [SELECT DISTINCT
    __C0 AS SID_PER_ID,
    __C1 AS SID_MAH_ID,
    __C2 AS SID_ITM_ID,
    __C3 AS SID_PRM_ID,
    __C4 AS SID_CHA_CODE,
    __C5 AS SID_SERVICE,
    __C6 AS SID_ITM_DIST
FROM PUBLIC.__T0
    /* SHOP_ITEM_BITMAP_CACHE."merge_scan" */]]

The first thing I could see is that there is no index for "sid_per_id" and
the second thing is that "merge_scan" is being used.
What is your thoughts?


Best Regards,
Igor

On Tue, Apr 12, 2016 at 2:11 PM, Igor Sapego <is...@gridgain.com> wrote:

> Arthi,
>
> How much time does it take? What is the speed of fetching in rows
> per second?
>
> Guys, can someone who is more familiar with Ignite's SQL queries
> take a look?
>
> Best Regards,
> Igor
>
> On Mon, Apr 11, 2016 at 5:17 PM, arthi <Arthi.Kasturirangan.ap@nielsen.com
> > wrote:
>
>> Hi Igor,
>>
>> the SQL query for getNext() takes a whole lot of time when the results
>> returned is more than 900,000 both using Java and C++ API. I have tried a
>> number of configurations with indexes, but just dont get to speed these
>> up.
>>
>> thanks,
>> Arthi
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p4064.html
>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>
>
>

Re: Host Arrays in C++ API

Posted by Igor Sapego <is...@gridgain.com>.
Arthi,

How much time does it take? What is the speed of fetching in rows
per second?

Guys, can someone who is more familiar with Ignite's SQL queries
take a look?

Best Regards,
Igor

On Mon, Apr 11, 2016 at 5:17 PM, arthi <Ar...@nielsen.com>
wrote:

> Hi Igor,
>
> the SQL query for getNext() takes a whole lot of time when the results
> returned is more than 900,000 both using Java and C++ API. I have tried a
> number of configurations with indexes, but just dont get to speed these up.
>
> thanks,
> Arthi
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p4064.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: Host Arrays in C++ API

Posted by arthi <Ar...@nielsen.com>.
Hi Igor,

the SQL query for getNext() takes a whole lot of time when the results
returned is more than 900,000 both using Java and C++ API. I have tried a
number of configurations with indexes, but just dont get to speed these up.

thanks,
Arthi



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p4064.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Host Arrays in C++ API

Posted by Igor Sapego <is...@gridgain.com>.
Arthi,

I've run some profiling over the code that you have provided and it seems
like there is almost no time spent in the client (C++) code. So I believe
host arrays would not help you much as your code spends the most of the
execution time on the query itself and not on the C++ calls.


Best Regards,
Igor

On Tue, Apr 5, 2016 at 8:29 AM, arthi <Ar...@nielsen.com>
wrote:

> Thanks Igor
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p3920.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: Host Arrays in C++ API

Posted by arthi <Ar...@nielsen.com>.
Thanks Igor



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p3920.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Host Arrays in C++ API

Posted by Igor Sapego <is...@gridgain.com>.
Arthi,

It is postponed for now and is not going to be included in 1.6 and
no, it is not in the master right now.

But as I've said before profiling showed that this was not a bottleneck
in your case and this patch would not fasten a query very much in your
case as the majority of the CPU time takes execution of the query itself,
not the C++ client code.


Best Regards,
Igor

On Fri, May 20, 2016 at 8:57 AM, arthi <Ar...@nielsen.com>
wrote:

> Hi Igor & Val,
>
> We are tracking this issue -
> https://issues.apache.org/jira/browse/IGNITE-2946.
>
> Is there a tested patch available for this fix.
> I saw a pull request - https://github.com/apache/ignite/pull/649
> I this in the master now?
>
> the jira says the fix version is 1.7. When is that planned to happen?
>
> Thanks,
> Arthi
>
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p5044.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: Host Arrays in C++ API

Posted by arthi <Ar...@nielsen.com>.
Hi Igor & Val,

We are tracking this issue -
https://issues.apache.org/jira/browse/IGNITE-2946.

Is there a tested patch available for this fix. 
I saw a pull request - https://github.com/apache/ignite/pull/649
I this in the master now? 

the jira says the fix version is 1.7. When is that planned to happen? 

Thanks,
Arthi




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p5044.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Host Arrays in C++ API

Posted by Igor Sapego <is...@gridgain.com>.
Hi Arthi,

Host arrays is not something that we have planned for the nearest
release. However, I have reproduced and investigated the issue
that you have provided and it is very likely that we are going to
optimize cursor operations in the nearest future which should
fasten your code a lot. You can track the corresponding ticket [1].

[1] - https://issues.apache.org/jira/browse/IGNITE-2946.


Best Regards,
Igor

On Mon, Apr 4, 2016 at 5:23 PM, arthi <Ar...@nielsen.com>
wrote:

> Hi Igor,
>
> Is this something we can expect as a feature in the near future? Please let
> us know if this is tracked as an enhancement / jira issue.
>
> Thanks for your help,
> Arthi
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Host-Arrays-in-C-API-tp3707p3902.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>