Posted to issues-all@impala.apache.org by "wesleydeng (Jira)" <ji...@apache.org> on 2020/11/12 12:55:00 UTC

[jira] [Updated] (IMPALA-10323) use snprintf instead of lexical_cast when casting int to string, to improve multi-thread performance

     [ https://issues.apache.org/jira/browse/IMPALA-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wesleydeng updated IMPALA-10323:
--------------------------------
    Description: 
For float_type, lexical_cast was replaced by snprintf (see https://issues.apache.org/jira/browse/IMPALA-1738), so why not make the same replacement for num_type?

The test uses two SQL cases:

1) *group by cast int to string*:

select cast(f1 as string) as kk, count(*) from test.my_table group by kk;

2) *group by int*:

select f1 as kk, count(*) from test.my_table group by kk;

!image-2020-11-12-20-43-47-010.png!

From the benchmark, we can see that performance degrades seriously as the number of threads increases.

With snprintf, however, performance improves significantly as the number of threads increases.
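
To illustrate the proposed change, here is a minimal sketch of an int-to-string conversion built on snprintf. The helper name IntToString is hypothetical and is not taken from the Impala code base; the Boost call it would stand in for is boost::lexical_cast<std::string>(value). The sketch only shows the shape of the replacement and says nothing about how Impala actually wires it into its cast functions.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not an Impala function): formats a signed 64-bit
// integer into a stack buffer with snprintf instead of going through
// boost::lexical_cast<std::string>(value).
inline std::string IntToString(std::int64_t value) {
  // "-9223372036854775808" is 20 characters; one extra byte holds the '\0'.
  char buf[21];
  int len = std::snprintf(buf, sizeof(buf), "%" PRId64, value);
  // snprintf returns the number of characters written (excluding '\0'),
  // or a negative value on error; the buffer is sized so it never truncates.
  assert(len > 0 && static_cast<std::size_t>(len) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(len));
}
```

A caller would use IntToString(f1) wherever the int column value needs to become a string. The buffer lives on the stack, so each thread formats into its own storage.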

 


> use snprintf instead of lexical_cast when casting int to string,  to improve multi-thread performance
> -----------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10323
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10323
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>         Environment: 3 nodes ( 24 core, 64 GB mem)
> impala 3.4
>            Reporter: wesleydeng
>            Priority: Major
>             Fix For: Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>
>         Attachments: image-2020-11-12-20-36-50-082.png, image-2020-11-12-20-43-47-010.png
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> For float_type, lexical_cast was replaced by snprintf (see https://issues.apache.org/jira/browse/IMPALA-1738), so why not make the same replacement for num_type?
> The test uses two SQL cases:
> 1) *group by cast int to string*:
> select cast(f1 as string) as kk, count(*) from test.my_table group by kk;
> 2) *group by int*:
> select f1 as kk, count(*) from test.my_table group by kk;
> !image-2020-11-12-20-43-47-010.png!
> From the benchmark, we can see that performance degrades seriously as the number of threads increases.
> With snprintf, however, performance improves significantly as the number of threads increases.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
