You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2016/03/17 03:05:33 UTC
[jira] [Commented] (SPARK-13863) TPCDS query 66 returns wrong
results compared to TPC official result set
[ https://issues.apache.org/jira/browse/SPARK-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198607#comment-15198607 ]
Xiao Li commented on SPARK-13863:
---------------------------------
It could be the same issue like https://issues.apache.org/jira/browse/SPARK-13861
> TPCDS query 66 returns wrong results compared to TPC official result set
> -------------------------------------------------------------------------
>
> Key: SPARK-13863
> URL: https://issues.apache.org/jira/browse/SPARK-13863
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: JESSE CHEN
> Labels: tpcds-result-mismatch
>
> Testing Spark SQL using TPC queries. Query 66 returns wrong results compared to official result set. This is at 1GB SF (validation run).
> Aggregations slightly off -- eg. JAN_SALES column of "Doors canno" row - SparkSQL returns 6355232.185385704, expected 6355232.31
> Actual results:
> {noformat}
> [null,null,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,9597806.850651741,1.1121820530080795E7,8670867.81564045,8994785.945689201,1.088724806326294E7,1.4187671518377304E7,9732598.460139751,1.9798897020946026E7,2.1007842467959404E7,2.149551364927292E7,3.479566905774999E7,3.3122997954660416E7,null,null,null,null,null,null,null,null,null,null,null,null,2.1913594697555542E7,3.2518476414670944E7,2.48856624883976E7,2.5698343830046654E7,3.373591080598068E7,3.552703167087555E7,2.5465193481492043E7,5.362323870799959E7,5.1409986978201866E7,5.415917383586836E7,9.222704311805725E7,8.343539111531019E7]
> [Bad cards must make.,621234,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,9506753.593884468,8008140.429557085,6116769.711647987,1.1973045160133362E7,7756254.925520897,5352978.574095726,1.373399613500309E7,1.6418794411203384E7,1.7212743279764652E7,1.704270732417488E7,3.43049358570323E7,3.532416421229005E7,15.30301560102066,12.890698882477594,9.846160563729589,19.273003667109915,12.485238936569628,8.61668642427125,22.107605403121994,26.429323590150222,27.707342611261865,27.433635834765774,55.22063482847413,56.86128610521969,3.0534943928382874E7,2.4481686250203133E7,2.217871080008793E7,2.569579825610423E7,2.995490355044937E7,1.8084140250833035E7,3.0805576178061485E7,4.7156887432252884E7,5.115858869637826E7,5.5759943171424866E7,8.625354428184557E7,8.345155532035494E7]
> [Conventional childr,977787,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,8860645.460736752,1.441581376543355E7,6761497.232810497,1.1820654735879421E7,8246260.600341797,6636877.482845306,1.1434492123092413E7,2.5673812070380323E7,2.3074206999911785E7,2.1834582007320404E7,2.6894900596512794E7,3.357509177109933E7,9.061938296108202,14.743306840276613,6.9151024024767125,12.08919195681618,8.43359606984118,6.787651587559771,11.694256645969329,26.257060147435304,23.598398219562938,22.330611889215547,27.505888906799534,34.337838170377935,2.3836085704864502E7,3.20733132298584E7,2.503790437837982E7,2.2659895963564873E7,2.175740087420273E7,2.4451608012176514E7,2.1933001734852314E7,5.59967034604629E7,5.737188052299309E7,6.208721474336243E7,8.284991027382469E7,8.897031933202875E7]
> [Doors canno,294242,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,6355232.185385704,1.0198920296742141E7,1.0246200903741479E7,1.2209716492156029E7,8566998.262890816,8806316.75278151,9789405.6993227,1.646658496404171E7,2.6443785668474197E7,2.701604788320923E7,3.366058958298761E7,2.7462468750599384E7,21.59865751791282,34.66167405313361,34.822360178837414,41.495491779406166,29.115484067165177,29.928823053070296,33.26991285854059,55.96272783641258,89.87087386734116,91.81574310672585,114.39763726112386,93.33293258813964,2.2645142994330406E7,2.448725452685547E7,2.4925759290207863E7,3.0503655031727314E7,2.6558160276379585E7,2.0976233452690125E7,2.9895796101181984E7,5.600219855566597E7,5.348815865275085E7,7.628723580410767E7,8.248374754962921E7,8.808826726185608E7]
> [Important issues liv,138504,Fairview,Williamson County,TN,United States,DHL,BARIAN,2001,1.1748784594717264E7,1.435130566355586E7,9896470.867572784,7990874.805492401,8879247.840401173,7362383.04259038,1.0011144724414349E7,1.7741201390372872E7,2.1346976135887742E7,1.8074978020030975E7,2.967512567988676E7,3.2545325348875403E7,84.8263197793368,103.6165429414014,71.45259969078715,57.694180713137534,64.10824120892663,53.156465102743454,72.28054586448297,128.09161750110374,154.12534032149065,130.5014874662896,214.25464737398747,234.97751219369408,2.7204167203903973E7,2.598037822457385E7,1.9943398915802002E7,2.5710421112384796E7,1.948448105346489E7,2.6346611484448195E7,2.5075158296625137E7,5.409477817043829E7,4.106673223178029E7,5.454705814340496E7,7.246596285337901E7,9.277032812079096E7]
> {noformat}
> Expected results:
> {noformat}
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> | W_WAREHOUSE_NAME | W_WAREHOUSE_SQ_FT | W_CITY | W_COUNTY | W_STATE | W_COUNTRY | SHIP_CARRIERS | YEAR | JAN_SALES | FEB_SALES | MAR_SALES | APR_SALES | MAY_SALES | JUN_SALES | JUL_SALES | AUG_SALES | SEP_SALES | OCT_SALES | NOV_SALES | DEC_SALES | JAN_SALES_PER_SQ_FOOT | FEB_SALES_PER_SQ_FOOT | MAR_SALES_PER_SQ_FOOT | APR_SALES_PER_SQ_FOOT | MAY_SALES_PER_SQ_FOOT | JUN_SALES_PER_SQ_FOOT | JUL_SALES_PER_SQ_FOOT | AUG_SALES_PER_SQ_FOOT | SEP_SALES_PER_SQ_FOOT | OCT_SALES_PER_SQ_FOOT | NOV_SALES_PER_SQ_FOOT | DEC_SALES_PER_SQ_FOOT | JAN_NET | FEB_NET | MAR_NET | APR_NET | MAY_NET | JUN_NET | JUL_NET | AUG_NET | SEP_NET | OCT_NET | NOV_NET | DEC_NET |
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> | Bad cards must make. | 621234 | Fairview | Williamson County | TN | United States | DHL,BARIAN | 2001 | 9506753.46000 | 8008140.33000 | 6116769.63000 | 11973045.15000 | 7756254.92000 | 5352978.49000 | 13733996.10000 | 16418794.37000 | 17212743.32000 | 17042707.41000 | 34304935.61000 | 35324164.21000 | 15.30302 | 12.89070 | 9.84616 | 19.27300 | 12.48524 | 8.61669 | 22.10761 | 26.42932 | 27.70734 | 27.43364 | 55.22063 | 56.86129 | 30534943.77000 | 24481685.94000 | 22178710.81000 | 25695798.18000 | 29954903.78000 | 18084140.05000 | 30805576.13000 | 47156887.22000 | 51158588.86000 | 55759942.80000 | 86253544.16000 | 83451555.63000 |
> | Conventional childr | 977787 | Fairview | Williamson County | TN | United States | DHL,BARIAN | 2001 | 8860645.55000 | 14415813.74000 | 6761497.23000 | 11820654.76000 | 8246260.69000 | 6636877.49000 | 11434492.25000 | 25673812.14000 | 23074206.96000 | 21834581.94000 | 26894900.53000 | 33575091.74000 | 9.06194 | 14.74331 | 6.91510 | 12.08919 | 8.43360 | 6.78765 | 11.69426 | 26.25706 | 23.59840 | 22.33061 | 27.50589 | 34.33784 | 23836085.83000 | 32073313.37000 | 25037904.18000 | 22659895.86000 | 21757401.03000 | 24451608.10000 | 21933001.85000 | 55996703.43000 | 57371880.44000 | 62087214.51000 | 82849910.15000 | 88970319.31000 |
> | Doors canno | 294242 | Fairview | Williamson County | TN | United States | DHL,BARIAN | 2001 | 6355232.31000 | 10198920.36000 | 10246200.97000 | 12209716.50000 | 8566998.28000 | 8806316.81000 | 9789405.60000 | 16466584.88000 | 26443785.61000 | 27016047.80000 | 33660589.67000 | 27462468.62000 | 21.59866 | 34.66167 | 34.82236 | 41.49549 | 29.11548 | 29.92882 | 33.26991 | 55.96273 | 89.87087 | 91.81574 | 114.39764 | 93.33293 | 22645143.09000 | 24487254.60000 | 24925759.42000 | 30503655.27000 | 26558160.29000 | 20976233.52000 | 29895796.09000 | 56002198.38000 | 53488158.53000 | 76287235.46000 | 82483747.59000 | 88088266.69000 |
> | Important issues liv | 138504 | Fairview | Williamson County | TN | United States | DHL,BARIAN | 2001 | 11748784.55000 | 14351305.77000 | 9896470.93000 | 7990874.78000 | 8879247.90000 | 7362383.09000 | 10011144.75000 | 17741201.32000 | 21346976.05000 | 18074978.16000 | 29675125.64000 | 32545325.29000 | 84.82632 | 103.61654 | 71.45260 | 57.69418 | 64.10824 | 53.15647 | 72.28055 | 128.09162 | 154.12534 | 130.50149 | 214.25465 | 234.97751 | 27204167.15000 | 25980378.13000 | 19943398.93000 | 25710421.13000 | 19484481.03000 | 26346611.48000 | 25075158.43000 | 54094778.13000 | 41066732.11000 | 54547058.28000 | 72465962.92000 | 92770328.27000 |
> | [NULL] | [NULL] | Fairview | Williamson County | TN | United States | DHL,BARIAN | 2001 | 9597806.95000 | 11121820.57000 | 8670867.91000 | 8994786.04000 | 10887248.09000 | 14187671.36000 | 9732598.41000 | 19798897.07000 | 21007842.34000 | 21495513.67000 | 34795669.17000 | 33122997.94000 | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] | 21913594.59000 | 32518476.51000 | 24885662.72000 | 25698343.86000 | 33735910.61000 | 35527031.58000 | 25465193.48000 | 53623238.66000 | 51409986.76000 | 54159173.90000 | 92227043.25000 | 83435390.84000 |
> +----------------------+-------------------+----------+-------------------+---------+---------------+---------------+------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
> {noformat}
> Query used:
> {noformat}
> -- start query 66 in stream 0 using template query66.tpl and seed QUALIFICATION
> select
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,ship_carriers
> ,year
> ,sum(jan_sales) as jan_sales
> ,sum(feb_sales) as feb_sales
> ,sum(mar_sales) as mar_sales
> ,sum(apr_sales) as apr_sales
> ,sum(may_sales) as may_sales
> ,sum(jun_sales) as jun_sales
> ,sum(jul_sales) as jul_sales
> ,sum(aug_sales) as aug_sales
> ,sum(sep_sales) as sep_sales
> ,sum(oct_sales) as oct_sales
> ,sum(nov_sales) as nov_sales
> ,sum(dec_sales) as dec_sales
> ,sum(jan_sales/w_warehouse_sq_ft) as jan_sales_per_sq_foot
> ,sum(feb_sales/w_warehouse_sq_ft) as feb_sales_per_sq_foot
> ,sum(mar_sales/w_warehouse_sq_ft) as mar_sales_per_sq_foot
> ,sum(apr_sales/w_warehouse_sq_ft) as apr_sales_per_sq_foot
> ,sum(may_sales/w_warehouse_sq_ft) as may_sales_per_sq_foot
> ,sum(jun_sales/w_warehouse_sq_ft) as jun_sales_per_sq_foot
> ,sum(jul_sales/w_warehouse_sq_ft) as jul_sales_per_sq_foot
> ,sum(aug_sales/w_warehouse_sq_ft) as aug_sales_per_sq_foot
> ,sum(sep_sales/w_warehouse_sq_ft) as sep_sales_per_sq_foot
> ,sum(oct_sales/w_warehouse_sq_ft) as oct_sales_per_sq_foot
> ,sum(nov_sales/w_warehouse_sq_ft) as nov_sales_per_sq_foot
> ,sum(dec_sales/w_warehouse_sq_ft) as dec_sales_per_sq_foot
> ,sum(jan_net) as jan_net
> ,sum(feb_net) as feb_net
> ,sum(mar_net) as mar_net
> ,sum(apr_net) as apr_net
> ,sum(may_net) as may_net
> ,sum(jun_net) as jun_net
> ,sum(jul_net) as jul_net
> ,sum(aug_net) as aug_net
> ,sum(sep_net) as sep_net
> ,sum(oct_net) as oct_net
> ,sum(nov_net) as nov_net
> ,sum(dec_net) as dec_net
> from (
> select
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,concat('DHL', ',', 'BARIAN') as ship_carriers
> ,d_year as year
> ,sum(case when d_moy = 1
> then ws_ext_sales_price* ws_quantity else 0 end) as jan_sales
> ,sum(case when d_moy = 2
> then ws_ext_sales_price* ws_quantity else 0 end) as feb_sales
> ,sum(case when d_moy = 3
> then ws_ext_sales_price* ws_quantity else 0 end) as mar_sales
> ,sum(case when d_moy = 4
> then ws_ext_sales_price* ws_quantity else 0 end) as apr_sales
> ,sum(case when d_moy = 5
> then ws_ext_sales_price* ws_quantity else 0 end) as may_sales
> ,sum(case when d_moy = 6
> then ws_ext_sales_price* ws_quantity else 0 end) as jun_sales
> ,sum(case when d_moy = 7
> then ws_ext_sales_price* ws_quantity else 0 end) as jul_sales
> ,sum(case when d_moy = 8
> then ws_ext_sales_price* ws_quantity else 0 end) as aug_sales
> ,sum(case when d_moy = 9
> then ws_ext_sales_price* ws_quantity else 0 end) as sep_sales
> ,sum(case when d_moy = 10
> then ws_ext_sales_price* ws_quantity else 0 end) as oct_sales
> ,sum(case when d_moy = 11
> then ws_ext_sales_price* ws_quantity else 0 end) as nov_sales
> ,sum(case when d_moy = 12
> then ws_ext_sales_price* ws_quantity else 0 end) as dec_sales
> ,sum(case when d_moy = 1
> then ws_net_paid * ws_quantity else 0 end) as jan_net
> ,sum(case when d_moy = 2
> then ws_net_paid * ws_quantity else 0 end) as feb_net
> ,sum(case when d_moy = 3
> then ws_net_paid * ws_quantity else 0 end) as mar_net
> ,sum(case when d_moy = 4
> then ws_net_paid * ws_quantity else 0 end) as apr_net
> ,sum(case when d_moy = 5
> then ws_net_paid * ws_quantity else 0 end) as may_net
> ,sum(case when d_moy = 6
> then ws_net_paid * ws_quantity else 0 end) as jun_net
> ,sum(case when d_moy = 7
> then ws_net_paid * ws_quantity else 0 end) as jul_net
> ,sum(case when d_moy = 8
> then ws_net_paid * ws_quantity else 0 end) as aug_net
> ,sum(case when d_moy = 9
> then ws_net_paid * ws_quantity else 0 end) as sep_net
> ,sum(case when d_moy = 10
> then ws_net_paid * ws_quantity else 0 end) as oct_net
> ,sum(case when d_moy = 11
> then ws_net_paid * ws_quantity else 0 end) as nov_net
> ,sum(case when d_moy = 12
> then ws_net_paid * ws_quantity else 0 end) as dec_net
> from
> web_sales
> ,warehouse
> ,date_dim
> ,time_dim
> ,ship_mode
> where
> ws_warehouse_sk = w_warehouse_sk
> and ws_sold_date_sk = d_date_sk
> and ws_sold_time_sk = t_time_sk
> and ws_ship_mode_sk = sm_ship_mode_sk
> and d_year = 2001
> and t_time between 30838 and 30838+28800
> and sm_carrier in ('DHL','BARIAN')
> group by
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,d_year
> union all
> select
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,concat('DHL', ',', 'BARIAN') as ship_carriers
> ,d_year as year
> ,sum(case when d_moy = 1
> then cs_sales_price* cs_quantity else 0 end) as jan_sales
> ,sum(case when d_moy = 2
> then cs_sales_price* cs_quantity else 0 end) as feb_sales
> ,sum(case when d_moy = 3
> then cs_sales_price* cs_quantity else 0 end) as mar_sales
> ,sum(case when d_moy = 4
> then cs_sales_price* cs_quantity else 0 end) as apr_sales
> ,sum(case when d_moy = 5
> then cs_sales_price* cs_quantity else 0 end) as may_sales
> ,sum(case when d_moy = 6
> then cs_sales_price* cs_quantity else 0 end) as jun_sales
> ,sum(case when d_moy = 7
> then cs_sales_price* cs_quantity else 0 end) as jul_sales
> ,sum(case when d_moy = 8
> then cs_sales_price* cs_quantity else 0 end) as aug_sales
> ,sum(case when d_moy = 9
> then cs_sales_price* cs_quantity else 0 end) as sep_sales
> ,sum(case when d_moy = 10
> then cs_sales_price* cs_quantity else 0 end) as oct_sales
> ,sum(case when d_moy = 11
> then cs_sales_price* cs_quantity else 0 end) as nov_sales
> ,sum(case when d_moy = 12
> then cs_sales_price* cs_quantity else 0 end) as dec_sales
> ,sum(case when d_moy = 1
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as jan_net
> ,sum(case when d_moy = 2
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as feb_net
> ,sum(case when d_moy = 3
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as mar_net
> ,sum(case when d_moy = 4
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as apr_net
> ,sum(case when d_moy = 5
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as may_net
> ,sum(case when d_moy = 6
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as jun_net
> ,sum(case when d_moy = 7
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as jul_net
> ,sum(case when d_moy = 8
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as aug_net
> ,sum(case when d_moy = 9
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as sep_net
> ,sum(case when d_moy = 10
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as oct_net
> ,sum(case when d_moy = 11
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as nov_net
> ,sum(case when d_moy = 12
> then cs_net_paid_inc_tax * cs_quantity else 0 end) as dec_net
> from
> catalog_sales
> ,warehouse
> ,date_dim
> ,time_dim
> ,ship_mode
> where
> cs_warehouse_sk = w_warehouse_sk
> and cs_sold_date_sk = d_date_sk
> and cs_sold_time_sk = t_time_sk
> and cs_ship_mode_sk = sm_ship_mode_sk
> and d_year = 2001
> and t_time between 30838 and 30838+28800
> and sm_carrier in ('DHL','BARIAN')
> group by
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,d_year
> ) x
> group by
> w_warehouse_name
> ,w_warehouse_sq_ft
> ,w_city
> ,w_county
> ,w_state
> ,w_country
> ,ship_carriers
> ,year
> order by w_warehouse_name
> limit 100;
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org