Posted to user@spark.apache.org by "boyingking@163.com" <bo...@163.com> on 2014/09/15 04:41:40 UTC

About SparkSQL 1.1.0 join between more than two table

Hi:
When I use Spark SQL (1.0.1), I find that it does not support a join across three tables, e.g.:
sql("SELECT * FROM youhao_data left join youhao_age on (youhao_data.rowkey=youhao_age.rowkey) left join youhao_totalKiloMeter on (youhao_age.rowkey=youhao_totalKiloMeter.rowkey)") 
I get the following exception:
Exception in thread "main" java.lang.RuntimeException: [1.90] failure: ``UNION'' expected but `left' found

Does Spark SQL 1.1.0 support joins across three tables?




boyingking@163.com

Re: About SparkSQL 1.1.0 join between more than two table

Posted by Yin Huai <hu...@gmail.com>.
Spark SQL 1.0.1 does not support outer joins; that support was added in 1.1.
Your query should work in 1.1.

On Mon, Sep 15, 2014 at 5:35 AM, Yanbo Liang <ya...@gmail.com> wrote:

> Spark SQL supports both SQL and HiveQL, through SQLContext and
> HiveContext respectively.
> As far as I know, the SQLContext in Spark SQL 1.1.0 cannot join three
> tables directly.
> However, you can rewrite your query with a subquery, such as:
>
> SELECT * FROM (SELECT * FROM youhao_data left join youhao_age on
> (youhao_data.rowkey=youhao_age.rowkey)) tmp left join
> youhao_totalKiloMeter on (tmp.rowkey=youhao_totalKiloMeter.rowkey)
>
> With Spark 1.1.0, HiveContext can handle the three-table join:
>
> // sc is an existing SparkContext.
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> sqlContext.sql("""SELECT * FROM youhao_data
>   left join youhao_age on (youhao_data.rowkey=youhao_age.rowkey)
>   left join youhao_totalKiloMeter on (youhao_age.rowkey=youhao_totalKiloMeter.rowkey)""")

Re: About SparkSQL 1.1.0 join between more than two table

Posted by Yanbo Liang <ya...@gmail.com>.
Spark SQL supports both SQL and HiveQL, through SQLContext and HiveContext
respectively.
As far as I know, the SQLContext in Spark SQL 1.1.0 cannot join three tables
directly.
However, you can rewrite your query with a subquery, such as:

SELECT * FROM (SELECT * FROM youhao_data left join youhao_age on
(youhao_data.rowkey=youhao_age.rowkey)) tmp left join youhao_totalKiloMeter
on (tmp.rowkey=youhao_totalKiloMeter.rowkey)
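
One detail of the subquery form may be worth checking. The inner SELECT *
returns a rowkey column from both youhao_data and youhao_age, so which of the
two tmp.rowkey refers to (or whether the parser rejects it as ambiguous) is an
assumption here, not something confirmed in this thread. The sketch below is a
plain Scala model of a left join, not Spark code, and the rowkey values are
made up. It only illustrates that joining the third table on youhao_data's key
instead of youhao_age's key can change the result for rows that have no match
in youhao_age.

```scala
// Plain Scala model of LEFT JOIN semantics; no Spark involved.
// The rowkey values below are hypothetical stand-ins for the three tables.
object LeftJoinRewrite {
  val data  = Seq("a", "b", "c") // rowkeys in youhao_data
  val age   = Seq("a", "b")      // rowkeys in youhao_age
  val total = Seq("a", "c")      // rowkeys in youhao_totalKiloMeter

  // LEFT JOIN on key equality: every left row is kept, paired with None
  // when nothing matches. A None key models SQL NULL and never matches.
  def leftJoin[L](left: Seq[L], right: Seq[String])(
      key: L => Option[String]): Seq[(L, Option[String])] =
    left.flatMap { l =>
      val matches = right.filter(r => key(l).exists(_ == r))
      if (matches.isEmpty) Seq((l, None)) else matches.map(r => (l, Some(r)))
    }

  // Inner part shared by both forms: youhao_data left join youhao_age.
  val tmp: Seq[(String, Option[String])] = leftJoin(data, age)(d => Some(d))

  // Original query: third table joined on youhao_age.rowkey.
  val onAgeKey = leftJoin(tmp, total) { case (_, ageKey) => ageKey }

  // Rewrite, if tmp.rowkey resolves to youhao_data.rowkey.
  val onDataKey = leftJoin(tmp, total) { case (dataKey, _) => Some(dataKey) }

  def main(args: Array[String]): Unit = {
    println("joined on youhao_age.rowkey:")
    onAgeKey.foreach(println)
    println("joined on youhao_data.rowkey:")
    onDataKey.foreach(println)
  }
}
```

In this model the two results differ only for rowkey "c", which has no row in
youhao_age. If the original meaning has to be preserved, one option is to
select the two keys in the subquery under distinct aliases and join the third
table on the alias for youhao_age's key.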

With Spark 1.1.0, HiveContext can handle the three-table join:

// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("""SELECT * FROM youhao_data
  left join youhao_age on (youhao_data.rowkey=youhao_age.rowkey)
  left join youhao_totalKiloMeter on (youhao_age.rowkey=youhao_totalKiloMeter.rowkey)""")
