You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2016/03/17 03:05:33 UTC

[jira] [Commented] (SPARK-13863) TPCDS query 66 returns wrong results compared to TPC official result set

    [ https://issues.apache.org/jira/browse/SPARK-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198607#comment-15198607 ] 

Xiao Li commented on SPARK-13863:
---------------------------------

It could be the same issue like https://issues.apache.org/jira/browse/SPARK-13861

> TPCDS query 66 returns wrong results compared to TPC official result set 
> -------------------------------------------------------------------------
>
>                 Key: SPARK-13863
>                 URL: https://issues.apache.org/jira/browse/SPARK-13863
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: JESSE CHEN
>              Labels: tpcds-result-mismatch
>
> Testing Spark SQL using TPC queries. Query 66 returns wrong results compared to official result set. This is at 1GB SF (validation run).
> Aggregations slightly off -- eg. JAN_SALES column of "Doors canno"  row - SparkSQL returns 6355232.185385704, expected 6355232.31
> Actual results:
> {noformat}
> [null,null,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,9597806.850651741,1.1121820530080795E7,8670867.81564045,8994785.945689201,1.088724806326294E7,1.4187671518377304E7,9732598.460139751,1.9798897020946026E7,2.1007842467959404E7,2.149551364927292E7,3.479566905774999E7,3.3122997954660416E7,null,null,null,null,null,null,null,null,null,null,null,null,2.1913594697555542E7,3.2518476414670944E7,2.48856624883976E7,2.5698343830046654E7,3.373591080598068E7,3.552703167087555E7,2.5465193481492043E7,5.362323870799959E7,5.1409986978201866E7,5.415917383586836E7,9.222704311805725E7,8.343539111531019E7]
> [Bad cards must make.,621234,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,9506753.593884468,8008140.429557085,6116769.711647987,1.1973045160133362E7,7756254.925520897,5352978.574095726,1.373399613500309E7,1.6418794411203384E7,1.7212743279764652E7,1.704270732417488E7,3.43049358570323E7,3.532416421229005E7,15.30301560102066,12.890698882477594,9.846160563729589,19.273003667109915,12.485238936569628,8.61668642427125,22.107605403121994,26.429323590150222,27.707342611261865,27.433635834765774,55.22063482847413,56.86128610521969,3.0534943928382874E7,2.4481686250203133E7,2.217871080008793E7,2.569579825610423E7,2.995490355044937E7,1.8084140250833035E7,3.0805576178061485E7,4.7156887432252884E7,5.115858869637826E7,5.5759943171424866E7,8.625354428184557E7,8.345155532035494E7]
> [Conventional childr,977787,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,8860645.460736752,1.441581376543355E7,6761497.232810497,1.1820654735879421E7,8246260.600341797,6636877.482845306,1.1434492123092413E7,2.5673812070380323E7,2.3074206999911785E7,2.1834582007320404E7,2.6894900596512794E7,3.357509177109933E7,9.061938296108202,14.743306840276613,6.9151024024767125,12.08919195681618,8.43359606984118,6.787651587559771,11.694256645969329,26.257060147435304,23.598398219562938,22.330611889215547,27.505888906799534,34.337838170377935,2.3836085704864502E7,3.20733132298584E7,2.503790437837982E7,2.2659895963564873E7,2.175740087420273E7,2.4451608012176514E7,2.1933001734852314E7,5.59967034604629E7,5.737188052299309E7,6.208721474336243E7,8.284991027382469E7,8.897031933202875E7]
> [Doors canno,294242,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,6355232.185385704,1.0198920296742141E7,1.0246200903741479E7,1.2209716492156029E7,8566998.262890816,8806316.75278151,9789405.6993227,1.646658496404171E7,2.6443785668474197E7,2.701604788320923E7,3.366058958298761E7,2.7462468750599384E7,21.59865751791282,34.66167405313361,34.822360178837414,41.495491779406166,29.115484067165177,29.928823053070296,33.26991285854059,55.96272783641258,89.87087386734116,91.81574310672585,114.39763726112386,93.33293258813964,2.2645142994330406E7,2.448725452685547E7,2.4925759290207863E7,3.0503655031727314E7,2.6558160276379585E7,2.0976233452690125E7,2.9895796101181984E7,5.600219855566597E7,5.348815865275085E7,7.628723580410767E7,8.248374754962921E7,8.808826726185608E7]
> [Important issues liv,138504,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,1.1748784594717264E7,1.435130566355586E7,9896470.867572784,7990874.805492401,8879247.840401173,7362383.04259038,1.0011144724414349E7,1.7741201390372872E7,2.1346976135887742E7,1.8074978020030975E7,2.967512567988676E7,3.2545325348875403E7,84.8263197793368,103.6165429414014,71.45259969078715,57.694180713137534,64.10824120892663,53.156465102743454,72.28054586448297,128.09161750110374,154.12534032149065,130.5014874662896,214.25464737398747,234.97751219369408,2.7204167203903973E7,2.598037822457385E7,1.9943398915802002E7,2.5710421112384796E7,1.948448105346489E7,2.6346611484448195E7,2.5075158296625137E7,5.409477817043829E7,4.106673223178029E7,5.454705814340496E7,7.246596285337901E7,9.277032812079096E7]
> {noformat}
> Expected results:
> {noformat}
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> | W_WAREHOUSE_NAME     | W_WAREHOUSE_SQ_FT | W_CITY   | W_COUNTY          | W_STATE | W_COUNTRY     | SHIP_CARRIERS | YEAR |      JAN_SALES |      FEB_SALES |      MAR_SALES |      APR_SALES |      MAY_SALES |      JUN_SALES |      JUL_SALES |      AUG_SALES |      SEP_SALES |      OCT_SALES |      NOV_SALES |      DEC_SALES | JAN_SALES_PER_SQ_FOOT | FEB_SALES_PER_SQ_FOOT | MAR_SALES_PER_SQ_FOOT | APR_SALES_PER_SQ_FOOT | MAY_SALES_PER_SQ_FOOT | JUN_SALES_PER_SQ_FOOT | JUL_SALES_PER_SQ_FOOT | AUG_SALES_PER_SQ_FOOT | SEP_SALES_PER_SQ_FOOT | OCT_SALES_PER_SQ_FOOT | NOV_SALES_PER_SQ_FOOT | DEC_SALES_PER_SQ_FOOT |        JAN_NET |        FEB_NET |        MAR_NET |        APR_NET |        MAY_NET |        JUN_NET |        JUL_NET |        AUG_NET |        SEP_NET |        OCT_NET |        NOV_NET |        DEC_NET |
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> | Bad cards must make. |            621234 | Fairview | Williamson County | TN      | United States | DHL,BARIAN    | 2001 |  9506753.46000 |  8008140.33000 |  6116769.63000 | 11973045.15000 |  7756254.92000 |  5352978.49000 | 13733996.10000 | 16418794.37000 | 17212743.32000 | 17042707.41000 | 34304935.61000 | 35324164.21000 |              15.30302 |              12.89070 |               9.84616 |              19.27300 |              12.48524 |               8.61669 |              22.10761 |              26.42932 |              27.70734 |              27.43364 |              55.22063 |              56.86129 | 30534943.77000 | 24481685.94000 | 22178710.81000 | 25695798.18000 | 29954903.78000 | 18084140.05000 | 30805576.13000 | 47156887.22000 | 51158588.86000 | 55759942.80000 | 86253544.16000 | 83451555.63000 |
> | Conventional childr  |            977787 | Fairview | Williamson County | TN      | United States | DHL,BARIAN    | 2001 |  8860645.55000 | 14415813.74000 |  6761497.23000 | 11820654.76000 |  8246260.69000 |  6636877.49000 | 11434492.25000 | 25673812.14000 | 23074206.96000 | 21834581.94000 | 26894900.53000 | 33575091.74000 |               9.06194 |              14.74331 |               6.91510 |              12.08919 |               8.43360 |               6.78765 |              11.69426 |              26.25706 |              23.59840 |              22.33061 |              27.50589 |              34.33784 | 23836085.83000 | 32073313.37000 | 25037904.18000 | 22659895.86000 | 21757401.03000 | 24451608.10000 | 21933001.85000 | 55996703.43000 | 57371880.44000 | 62087214.51000 | 82849910.15000 | 88970319.31000 |
> | Doors canno          |            294242 | Fairview | Williamson County | TN      | United States | DHL,BARIAN    | 2001 |  6355232.31000 | 10198920.36000 | 10246200.97000 | 12209716.50000 |  8566998.28000 |  8806316.81000 |  9789405.60000 | 16466584.88000 | 26443785.61000 | 27016047.80000 | 33660589.67000 | 27462468.62000 |              21.59866 |              34.66167 |              34.82236 |              41.49549 |              29.11548 |              29.92882 |              33.26991 |              55.96273 |              89.87087 |              91.81574 |             114.39764 |              93.33293 | 22645143.09000 | 24487254.60000 | 24925759.42000 | 30503655.27000 | 26558160.29000 | 20976233.52000 | 29895796.09000 | 56002198.38000 | 53488158.53000 | 76287235.46000 | 82483747.59000 | 88088266.69000 |
> | Important issues liv |            138504 | Fairview | Williamson County | TN      | United States | DHL,BARIAN    | 2001 | 11748784.55000 | 14351305.77000 |  9896470.93000 |  7990874.78000 |  8879247.90000 |  7362383.09000 | 10011144.75000 | 17741201.32000 | 21346976.05000 | 18074978.16000 | 29675125.64000 | 32545325.29000 |              84.82632 |             103.61654 |              71.45260 |              57.69418 |              64.10824 |              53.15647 |              72.28055 |             128.09162 |             154.12534 |             130.50149 |             214.25465 |             234.97751 | 27204167.15000 | 25980378.13000 | 19943398.93000 | 25710421.13000 | 19484481.03000 | 26346611.48000 | 25075158.43000 | 54094778.13000 | 41066732.11000 | 54547058.28000 | 72465962.92000 | 92770328.27000 |
> | [NULL]               |            [NULL] | Fairview | Williamson County | TN      | United States | DHL,BARIAN    | 2001 |  9597806.95000 | 11121820.57000 |  8670867.91000 |  8994786.04000 | 10887248.09000 | 14187671.36000 |  9732598.41000 | 19798897.07000 | 21007842.34000 | 21495513.67000 | 34795669.17000 | 33122997.94000 |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] |                [NULL] | 21913594.59000 | 32518476.51000 | 24885662.72000 | 25698343.86000 | 33735910.61000 | 35527031.58000 | 25465193.48000 | 53623238.66000 | 51409986.76000 | 54159173.90000 | 92227043.25000 | 83435390.84000 |
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> {noformat}
> Query used:
> {noformat}
> -- start query 66 in stream 0 using template query66.tpl and seed QUALIFICATION
>   select   
>          w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>         ,ship_carriers
>         ,year
>  	,sum(jan_sales) as jan_sales
>  	,sum(feb_sales) as feb_sales
>  	,sum(mar_sales) as mar_sales
>  	,sum(apr_sales) as apr_sales
>  	,sum(may_sales) as may_sales
>  	,sum(jun_sales) as jun_sales
>  	,sum(jul_sales) as jul_sales
>  	,sum(aug_sales) as aug_sales
>  	,sum(sep_sales) as sep_sales
>  	,sum(oct_sales) as oct_sales
>  	,sum(nov_sales) as nov_sales
>  	,sum(dec_sales) as dec_sales
>  	,sum(jan_sales/w_warehouse_sq_ft) as jan_sales_per_sq_foot
>  	,sum(feb_sales/w_warehouse_sq_ft) as feb_sales_per_sq_foot
>  	,sum(mar_sales/w_warehouse_sq_ft) as mar_sales_per_sq_foot
>  	,sum(apr_sales/w_warehouse_sq_ft) as apr_sales_per_sq_foot
>  	,sum(may_sales/w_warehouse_sq_ft) as may_sales_per_sq_foot
>  	,sum(jun_sales/w_warehouse_sq_ft) as jun_sales_per_sq_foot
>  	,sum(jul_sales/w_warehouse_sq_ft) as jul_sales_per_sq_foot
>  	,sum(aug_sales/w_warehouse_sq_ft) as aug_sales_per_sq_foot
>  	,sum(sep_sales/w_warehouse_sq_ft) as sep_sales_per_sq_foot
>  	,sum(oct_sales/w_warehouse_sq_ft) as oct_sales_per_sq_foot
>  	,sum(nov_sales/w_warehouse_sq_ft) as nov_sales_per_sq_foot
>  	,sum(dec_sales/w_warehouse_sq_ft) as dec_sales_per_sq_foot
>  	,sum(jan_net) as jan_net
>  	,sum(feb_net) as feb_net
>  	,sum(mar_net) as mar_net
>  	,sum(apr_net) as apr_net
>  	,sum(may_net) as may_net
>  	,sum(jun_net) as jun_net
>  	,sum(jul_net) as jul_net
>  	,sum(aug_net) as aug_net
>  	,sum(sep_net) as sep_net
>  	,sum(oct_net) as oct_net
>  	,sum(nov_net) as nov_net
>  	,sum(dec_net) as dec_net
>  from (
>     select 
>  	w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>  	,concat('DHL', ',', 'BARIAN') as ship_carriers
>        ,d_year as year
>  	,sum(case when d_moy = 1 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as jan_sales
>  	,sum(case when d_moy = 2 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as feb_sales
>  	,sum(case when d_moy = 3 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as mar_sales
>  	,sum(case when d_moy = 4 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as apr_sales
>  	,sum(case when d_moy = 5 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as may_sales
>  	,sum(case when d_moy = 6 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as jun_sales
>  	,sum(case when d_moy = 7 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as jul_sales
>  	,sum(case when d_moy = 8 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as aug_sales
>  	,sum(case when d_moy = 9 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as sep_sales
>  	,sum(case when d_moy = 10 
>  		then ws_ext_sales_price* ws_quantity else 0 end) as oct_sales
>  	,sum(case when d_moy = 11
>  		then ws_ext_sales_price* ws_quantity else 0 end) as nov_sales
>  	,sum(case when d_moy = 12
>  		then ws_ext_sales_price* ws_quantity else 0 end) as dec_sales
>  	,sum(case when d_moy = 1 
>  		then ws_net_paid * ws_quantity else 0 end) as jan_net
>  	,sum(case when d_moy = 2
>  		then ws_net_paid * ws_quantity else 0 end) as feb_net
>  	,sum(case when d_moy = 3 
>  		then ws_net_paid * ws_quantity else 0 end) as mar_net
>  	,sum(case when d_moy = 4 
>  		then ws_net_paid * ws_quantity else 0 end) as apr_net
>  	,sum(case when d_moy = 5 
>  		then ws_net_paid * ws_quantity else 0 end) as may_net
>  	,sum(case when d_moy = 6 
>  		then ws_net_paid * ws_quantity else 0 end) as jun_net
>  	,sum(case when d_moy = 7 
>  		then ws_net_paid * ws_quantity else 0 end) as jul_net
>  	,sum(case when d_moy = 8 
>  		then ws_net_paid * ws_quantity else 0 end) as aug_net
>  	,sum(case when d_moy = 9 
>  		then ws_net_paid * ws_quantity else 0 end) as sep_net
>  	,sum(case when d_moy = 10 
>  		then ws_net_paid * ws_quantity else 0 end) as oct_net
>  	,sum(case when d_moy = 11
>  		then ws_net_paid * ws_quantity else 0 end) as nov_net
>  	,sum(case when d_moy = 12
>  		then ws_net_paid * ws_quantity else 0 end) as dec_net
>      from
>           web_sales
>          ,warehouse
>          ,date_dim
>          ,time_dim
>  	  ,ship_mode
>      where
>             ws_warehouse_sk =  w_warehouse_sk
>         and ws_sold_date_sk = d_date_sk
>         and ws_sold_time_sk = t_time_sk
>  	and ws_ship_mode_sk = sm_ship_mode_sk
>         and d_year = 2001
>         and t_time between 30838 and 30838+28800
>  	and sm_carrier in ('DHL','BARIAN')
>      group by 
>         w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>        ,d_year
>  union all
>     select 
>  	w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>  	,concat('DHL', ',', 'BARIAN') as ship_carriers
>        ,d_year as year
>  	,sum(case when d_moy = 1 
>  		then cs_sales_price* cs_quantity else 0 end) as jan_sales
>  	,sum(case when d_moy = 2 
>  		then cs_sales_price* cs_quantity else 0 end) as feb_sales
>  	,sum(case when d_moy = 3 
>  		then cs_sales_price* cs_quantity else 0 end) as mar_sales
>  	,sum(case when d_moy = 4 
>  		then cs_sales_price* cs_quantity else 0 end) as apr_sales
>  	,sum(case when d_moy = 5 
>  		then cs_sales_price* cs_quantity else 0 end) as may_sales
>  	,sum(case when d_moy = 6 
>  		then cs_sales_price* cs_quantity else 0 end) as jun_sales
>  	,sum(case when d_moy = 7 
>  		then cs_sales_price* cs_quantity else 0 end) as jul_sales
>  	,sum(case when d_moy = 8 
>  		then cs_sales_price* cs_quantity else 0 end) as aug_sales
>  	,sum(case when d_moy = 9 
>  		then cs_sales_price* cs_quantity else 0 end) as sep_sales
>  	,sum(case when d_moy = 10 
>  		then cs_sales_price* cs_quantity else 0 end) as oct_sales
>  	,sum(case when d_moy = 11
>  		then cs_sales_price* cs_quantity else 0 end) as nov_sales
>  	,sum(case when d_moy = 12
>  		then cs_sales_price* cs_quantity else 0 end) as dec_sales
>  	,sum(case when d_moy = 1 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as jan_net
>  	,sum(case when d_moy = 2 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as feb_net
>  	,sum(case when d_moy = 3 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as mar_net
>  	,sum(case when d_moy = 4 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as apr_net
>  	,sum(case when d_moy = 5 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as may_net
>  	,sum(case when d_moy = 6 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as jun_net
>  	,sum(case when d_moy = 7 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as jul_net
>  	,sum(case when d_moy = 8 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as aug_net
>  	,sum(case when d_moy = 9 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as sep_net
>  	,sum(case when d_moy = 10 
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as oct_net
>  	,sum(case when d_moy = 11
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as nov_net
>  	,sum(case when d_moy = 12
>  		then cs_net_paid_inc_tax * cs_quantity else 0 end) as dec_net
>      from
>           catalog_sales
>          ,warehouse
>          ,date_dim
>          ,time_dim
>  	 ,ship_mode
>      where
>             cs_warehouse_sk =  w_warehouse_sk
>         and cs_sold_date_sk = d_date_sk
>         and cs_sold_time_sk = t_time_sk
>  	and cs_ship_mode_sk = sm_ship_mode_sk
>         and d_year = 2001
>         and t_time between 30838 and 30838+28800
>  	and sm_carrier in ('DHL','BARIAN')
>      group by 
>         w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>        ,d_year
>  ) x
>  group by 
>         w_warehouse_name
>  	,w_warehouse_sq_ft
>  	,w_city
>  	,w_county
>  	,w_state
>  	,w_country
>  	,ship_carriers
>        ,year
>  order by w_warehouse_name
>    limit 100;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org