Posted to issues@spark.apache.org by "Vadim Tkachenko (JIRA)" <ji...@apache.org> on 2016/01/12 07:16:40 UTC
[jira] [Created] (SPARK-12763) Spark gets stuck executing SSB query
Vadim Tkachenko created SPARK-12763:
---------------------------------------
Summary: Spark gets stuck executing SSB query
Key: SPARK-12763
URL: https://issues.apache.org/jira/browse/SPARK-12763
Project: Spark
Issue Type: Bug
Affects Versions: 1.6.0
Environment: Standalone cluster
Reporter: Vadim Tkachenko
I am trying to emulate an SSB (Star Schema Benchmark) load. The data was generated with https://github.com/Percona-Lab/ssb-dbgen at scale factor 1000 and converted to Parquet format.
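For reference, the conversion step might look roughly like the sketch below; this is my own minimal illustration, not part of the original report. The .tbl input path, the two-column schema, and the pipe-delimited layout are assumptions about the dbgen output:

// Minimal sketch (assumption, not from the report): convert pipe-delimited
// dbgen output to Parquet. The .tbl path and two-column schema are
// hypothetical; the real lineorder table has many more columns.
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, LongType}

val schema = StructType(Seq(
  StructField("LO_ORDERKEY", LongType, nullable = false),
  StructField("LO_CUSTKEY", LongType, nullable = false)))

// Parse each pipe-delimited line into a Row matching the schema
val rows = sc.textFile("/mnt/i3600/spark/ssb-1000/lineorder.tbl")
  .map(_.split('|'))
  .map(f => Row(f(0).toLong, f(1).toLong))

// Write the result out as Parquet
sqlContext.createDataFrame(rows, schema)
  .write.parquet("/mnt/i3600/spark/ssb-1000/lineorder")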
The following script then loads the Parquet data and registers temporary tables:
// Load the SSB tables from Parquet and cache them in memory
val pLineOrder = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/lineorder").cache()
val pDate = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/date").cache()
val pPart = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/part").cache()
val pSupplier = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/supplier").cache()
val pCustomer = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/customer").cache()

// Register each DataFrame as a temporary table so it can be queried with SQL
pLineOrder.registerTempTable("lineorder")
pDate.registerTempTable("date")
pPart.registerTempTable("part")
pSupplier.registerTempTable("supplier")
pCustomer.registerTempTable("customer")
Then I run the following query (SSB Q4.1):
val sql41 = sqlContext.sql("""
  select D_YEAR, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) as profit
  from date, customer, supplier, part, lineorder
  where LO_CUSTKEY = C_CUSTKEY
    and LO_SUPPKEY = S_SUPPKEY
    and LO_PARTKEY = P_PARTKEY
    and LO_ORDERDATE = D_DATEKEY
    and C_REGION = 'AMERICA'
    and S_REGION = 'AMERICA'
    and (P_MFGR = 'MFGR#1' or P_MFGR = 'MFGR#2')
  group by D_YEAR, C_NATION
  order by D_YEAR, C_NATION""")
Calling
sql41.show()
gets stuck: at some point there is no progress and the server is fully idle, but the job stays at the same stage.
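For what it's worth, one way to start narrowing down a hang like this (my suggestion, not something from the report) is to inspect the physical plan and rule out broadcast-join planning, since this query joins five tables:

// Diagnostic sketch (assumption, not from the report): print the logical
// and physical plans to see how the five-way join is planned.
sql41.explain(true)

// Disable automatic broadcast joins, then re-create the DataFrame from the
// same SQL text and call show() again; the conf only affects new plans.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")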