Posted to issues@drill.apache.org by "Vova Vysotskyi (Jira)" <ji...@apache.org> on 2019/11/27 15:02:00 UTC

[jira] [Commented] (DRILL-7433) Allow parallel computation of metadata aggregating

    [ https://issues.apache.org/jira/browse/DRILL-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983600#comment-16983600 ] 

Vova Vysotskyi commented on DRILL-7433:
---------------------------------------

Also, using the metastore may degrade performance during the planning stage, because obtaining data from the metastore is slow.
The tables below show query execution times with and without the metastore. Each query is listed as a row with its name, followed by ten runs per configuration.
TPCH sf100 queries:
||with metastore||without metastore||relative difference||
|01.q|01.q|0,2136413033|
|52268|51002| |
|34625|20796| |
|32959|19425| |
|15950|18197| |
|22416|17754| |
|27163|16909| |
|27421|16970| |
|25227|18030| |
|18330|17604| |
|15096|16774| |
|03.q|03.q|0,04156541396|
|35973|31481| |
|36750|29711| |
|32625|27854| |
|32278|29428| |
|23958|24997| |
|26034|26919| |
|22929|26388| |
|24210|25401| |
|24913|26183| |
|23450|22990| |
|04.q|04.q|0,02322124287|
|37555|33995| |
|34862|33037| |
|34204|33740| |
|34158|33768| |
|33750|32218| |
|32338|33131| |
|34014|32649| |
|33207|34177| |
|33882|33252| |
|34432|34484| |
|05.q|05.q|0,008933931101|
|70664|72823| |
|54647|62721| |
|62286|55560| |
|60189|57614| |
|55352|55113| |
|56105|55775| |
|55217|54555| |
|56984|56259| |
|56883|53271| |
|54731|54158| |
|06.q|06.q|0,07869497656|
|20456|24143| |
|19177|15156| |
|25584|13697| |
|11072|14983| |
|14532|13418| |
|17686|13136| |
|13528|11195| |
|11683|13069| |
|11583|11878| |
|11723|13992| |
|07.q|07.q|0,00583347923|
|123379|119524| |
|120580|122552| |
|117579|117693| |
|123118|130831| |
|122657|117540| |
|130975|126466| |
|131280|129346| |
|120486|133515| |
|130911|125968| |
|129918|120151| |
|08.q|08.q|0,0450149532|
|36860|34679| |
|42652|34999| |
|32632|29943| |
|31610|32049| |
|30963|31466| |
|30498|31426| |
|32913|29987| |
|32092|32066| |
|33344|32311| |
|31147|30718| |
|09.q|09.q|-0,05301775963|
|89685|97797| |
|89678|92092| |
|89502|93680| |
|90278|95763| |
|88865|97145| |
|92376|93754| |
|90427|94064| |
|89515|93743| |
|89038|92281| |
|90542|97298| |
|10.q|10.q|0,03454388863|
|81791|73476| |
|72593|74029| |
|80857|73800| |
|69306|70520| |
|70328|68953| |
|77046|69310| |
|74954|71102| |
|69848|68238| |
|70171|72197| |
|69242|69082| |
|12.q|12.q|0,00893469279|
|96569|97069| |
|80224|82925| |
|67306|78758| |
|78786|73690| |
|85805|73228| |
|67522|69948| |
|66085|74375| |
|81949|75093| |
|80095|72633| |
|68713|68428| |
|13.q|13.q|0,01470582191|
|25164|24717| |
|23648|25591| |
|25008|24840| |
|24153|23093| |
|24569|22339| |
|24822|23667| |
|23371|23373| |
|24631|25122| |
|24369|23937| |
|23570|23048| |
|14.q|14.q|0,01776126574|
|54458|59139| |
|47063|41498| |
|45234|40933| |
|44159|44274| |
|42809|40621| |
|42680|43721| |
|43317|43910| |
|45529|43257| |
|42988|43512| |
|43420|42770| |
|17.q|17.q|0,04398839931|
|32576|26967| |
|25589|25727| |
|26817|28556| |
|25937|24733| |
|27532|25848| |
|26882|24904| |
|26676|25672| |
|26564|24844| |
|26324|26747| |
|26811|25758| |
|18.q|18.q|0,05297477469|
|41809|34913| |
|22739|22337| |
|22473|21648| |
|25921|23012| |
|21859|21672| |
|22166|22257| |
|22273|21443| |
|22570|21811| |
|22816|23441| |
|22586|21582| |
|19.q|19.q|-0,008418635606|
|53345|55810| |
|49208|48747| |
|48738|49688| |
|47019|48855| |
|49612|51102| |
|51989|49797| |
|50560|49765| |
|47426|49787| |
|51627|48363| |
|47825|49622| |

TPCH sf100 limit 0 queries:
||with metastore||without metastore||relative difference||
|01.q|01.q|0,6049271044|
|7359|1076| |
|1497|720| |
|1317|703| |
|1169|695| |
|1476|723| |
|1243|700| |
|1191|744| |
|1189|703| |
|1147|669| |
|1206|692| |
|03.q|03.q|0,01374098248|
|2802|933| |
|1650|852| |
|1518|1099| |
|1678|921| |
|1640|923| |
|1695|900| |
|1558|4567| |
|1712|1528| |
|1614|4528| |
|1599|975| |
|04.q|04.q|0,4468584457|
|1909|964| |
|1501|798| |
|1332|859| |
|1264|802| |
|1303|833| |
|1301|768| |
|1343|804| |
|1296|805| |
|1310|800| |
|2354|816| |
|05.q|05.q|0,4369327897|
|2680|1245| |
|2408|1194| |
|2294|1215| |
|2223|1875| |
|2255|1167| |
|2199|1287| |
|2283|1219| |
|2058|1215| |
|2198|1243| |
|2211|1183| |
|06.q|06.q|0,3327788559|
|1324|685| |
|991|1403| |
|1016|635| |
|1012|657| |
|998|650| |
|988|634| |
|1484|664| |
|967|599| |
|1053|616| |
|988|677| |
|07.q|07.q|0,258553351|
|2272|1083| |
|1897|1054| |
|1907|1030| |
|1957|995| |
|1833|988| |
|1787|1155| |
|1873|1049| |
|1890|2008| |
|1877|2026| |
|1910|2850| |
|08.q|08.q|0,3703482672|
|2620|1370| |
|2339|1229| |
|2333|1262| |
|2456|1178| |
|2224|1239| |
|2210|1176| |
|2260|1230| |
|2282|2407| |
|2395|2427| |
|2340|1253| |
|09.q|09.q|0,4529243278|
|2575|1362| |
|2112|1165| |
|2155|1144| |
|2181|1146| |
|2243|1115| |
|2132|1204| |
|2065|1152| |
|2033|1227| |
|2098|1176| |
|2052|1151| |
|10.q|10.q|0,4167757287|
|2028|1028| |
|1613|997| |
|1635|939| |
|1721|973| |
|1753|998| |
|1617|988| |
|1579|1002| |
|1645|936| |
|1608|985| |
|1611|958| |
|12.q|12.q|0,3857109683|
|1448|799| |
|1203|739| |
|1248|784| |
|1269|758| |
|1204|800| |
|1519|815| |
|1231|801| |
|1241|816| |
|1285|809| |
|1271|815| |
|13.q|13.q|0,1735892294|
|729|544| |
|733|549| |
|615|549| |
|623|583| |
|633|599| |
|616|549| |
|632|583| |
|1009|658| |
|623|575| |
|769|581| |
|14.q|14.q|0,2801660967|
|1708|777| |
|1178|673| |
|1159|711| |
|1179|2517| |
|1143|707| |
|1115|723| |
|1217|677| |
|1234|683| |
|1160|673| |
|1189|700| |
|17.q|17.q|0,2704749362|
|1559|740| |
|1173|795| |
|1164|735| |
|1137|754| |
|1199|779| |
|1163|836| |
|1266|1186| |
|1140|1244| |
|1144|933| |
|1204|861| |
|18.q|18.q|0,4232437682|
|1761|891| |
|1481|907| |
|1561|871| |
|1453|861| |
|1487|889| |
|1587|842| |
|1562|921| |
|1536|959| |
|1515|863| |
|1502|904| |
|19.q|19.q|0,5957339773|
|6693|846| |
|1278|815| |
|1313|842| |
|1192|731| |
|2217|826| |
|2401|790| |
|1218|776| |
|1163|743| |
|1193|844| |
|1210|823| |
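
The value in the third column of each query row is consistent with the relative difference between the mean run times, (mean with metastore - mean without metastore) / mean with metastore. A minimal Python sketch that recomputes it for 01.q in both tables from the ten runs listed above (the function name is only illustrative, and the decimal commas from the tables are written as points):

```python
from statistics import mean


def relative_difference(with_metastore, without_metastore):
    """Return (mean_with - mean_without) / mean_with for two lists of run times."""
    mean_with = mean(with_metastore)
    mean_without = mean(without_metastore)
    return (mean_with - mean_without) / mean_with


# 01.q, full query: ten runs per configuration, copied from the first table.
q01_with = [52268, 34625, 32959, 15950, 22416, 27163, 27421, 25227, 18330, 15096]
q01_without = [51002, 20796, 19425, 18197, 17754, 16909, 16970, 18030, 17604, 16774]

# 01.q, limit 0 variant: ten runs per configuration, copied from the second table.
q01_limit0_with = [7359, 1497, 1317, 1169, 1476, 1243, 1191, 1189, 1147, 1206]
q01_limit0_without = [1076, 720, 703, 695, 723, 700, 744, 703, 669, 692]

print(round(relative_difference(q01_with, q01_without), 4))                # 0.2136
print(round(relative_difference(q01_limit0_with, q01_limit0_without), 4))  # 0.6049
```

A positive value means the runs with the metastore were slower on average. 09.q and 19.q in the first table have negative values, so there the metastore runs were slightly faster on average.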

Please note that some metastore optimizations, such as segment filtering, were not applied to these queries because the tables do not have segments.

The Iceberg project has many issues aimed at improving performance. For example, [https://github.com/apache/incubator-iceberg/issues/647] promises a performance gain of up to 60%.

> Allow parallel computation of metadata aggregating
> --------------------------------------------------
>
>                 Key: DRILL-7433
>                 URL: https://issues.apache.org/jira/browse/DRILL-7433
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.17.0
>            Reporter: Vova Vysotskyi
>            Assignee: Vova Vysotskyi
>            Priority: Major
>
> Metadata collection was implemented in the scope of DRILL-7273, but every aggregation is created using SINGLE distribution traits, and therefore metadata for every level is calculated by a single aggregate operator.
> The aim of this Jira is to allow creating several aggregate operators for the same metadata level to make metadata calculations more scalable.
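
The idea in the description above can be illustrated with a small two-phase aggregation sketch in Python. This is not Drill code: the `Stats` record, the `merge` function and the way files are split into chunks are hypothetical stand-ins for per-file metadata and for aggregate operators. Several partial aggregations run over disjoint slices of the files, and a final step merges their results. This works because row counts, minimums and maximums can be merged in any order. The thread pool only illustrates the structure and is not meant as a performance claim.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stats:
    """Hypothetical metadata for one file or one aggregated level."""
    row_count: int
    min_value: int
    max_value: int


def merge(a: Stats, b: Stats) -> Stats:
    """Combine two metadata records; the operation is associative and commutative."""
    return Stats(a.row_count + b.row_count,
                 min(a.min_value, b.min_value),
                 max(a.max_value, b.max_value))


def aggregate_single(files):
    """One aggregate step handles every record (analogous to the SINGLE case)."""
    return reduce(merge, files)


def aggregate_parallel(files, workers=4):
    """Several partial aggregates run concurrently, then one final merge."""
    chunks = [files[i::workers] for i in range(workers)]
    chunks = [chunk for chunk in chunks if chunk]  # drop empty slices
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate_single, chunks))
    return reduce(merge, partials)


# Made-up per-file metadata, used only to exercise the sketch.
files = [Stats(100, 5, 50), Stats(200, 1, 40), Stats(50, 10, 90),
         Stats(75, 3, 60), Stats(25, 7, 70)]

# Both strategies must produce the same aggregated metadata.
assert aggregate_parallel(files) == aggregate_single(files)
print(aggregate_parallel(files))
```

Because `merge` gives the same result however the records are grouped, the number of partial aggregates can be changed without affecting the final metadata.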


