Posted to user@spark.apache.org by gtinside <gt...@gmail.com> on 2014/09/02 22:26:15 UTC
flattening a list in spark sql
Hi,
I am using jsonRDD in Spark SQL and am having trouble iterating through an
array inside the JSON object. Please refer to the schema below:
 |-- Preferences: struct (nullable = true)
 |    |-- destinations: array (nullable = true)
 |-- user: string (nullable = true)
Sample Data:
-- Preferences: struct (nullable = true)
| |-- destinations: ("Paris","NYC","LA","EWR")
|-- user: "test1"
-- Preferences: struct (nullable = true)
| |-- destinations: ("Paris","SFO")
|-- user: "test2"
My requirement is to run a query that displays the number of users per
destination, as follows:
Number of users:10, Destination:Paris
Number of users:20, Destination:NYC
Number of users:30, Destination:SFO
To achieve this result, I need to flatten the destinations array, but I am
not sure how to do it. Can you please help?
Gaurav
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/flattening-a-list-in-spark-sql-tp13300.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: flattening a list in spark sql
Posted by gtinside <gt...@gmail.com>.
My bad, please ignore; it works!
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/flattening-a-list-in-spark-sql-tp13300p13901.html
Re: flattening a list in spark sql
Posted by gtinside <gt...@gmail.com>.
Hi,
Thanks, it worked; I really appreciate your help. I have also been trying to
chain multiple lateral views, but that does not seem to work.
Query:
hiveContext.sql("Select t2 from fav LATERAL VIEW explode(TABS) tabs1 as t1 LATERAL VIEW explode(t1) tabs2 as t2")
Exception:
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 't2, tree:
Regards,
Gaurav
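
For anyone reading this later, here is a rough sketch of what two chained LATERAL VIEW explode clauses are expected to produce. It is plain Python, not Spark, and it assumes that TABS is an array of arrays (the thread does not show the schema of fav). The helper `explode` and the sample rows are hypothetical and only illustrate the semantics: the first explode yields one row per inner array (t1), the second yields one row per element of that inner array (t2).

```python
def explode(rows, column, alias):
    """Yield one copy of each row per element of row[column], stored under alias.

    Rows whose column is missing, None, or empty yield nothing, which mirrors
    a LATERAL VIEW explode without the OUTER keyword.
    """
    for row in rows:
        for item in row.get(column) or []:
            yield {**row, alias: item}


# Hypothetical contents of the fav table: TABS is assumed to be array<array<string>>.
fav = [
    {"TABS": [["a", "b"], ["c"]]},
    {"TABS": [["d"]]},
]

# LATERAL VIEW explode(TABS) tabs1 AS t1
stage1 = explode(fav, "TABS", "t1")
# LATERAL VIEW explode(t1) tabs2 AS t2
stage2 = explode(stage1, "t1", "t2")

# SELECT t2
t2_values = [row["t2"] for row in stage2]
print(t2_values)
```

The second stage can only refer to t1 because the first stage has already added it to every row, which is why the order of the two clauses matters.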
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/flattening-a-list-in-spark-sql-tp13300p13894.html
Re: flattening a list in spark sql
Posted by Michael Armbrust <mi...@databricks.com>.
Yes, you can. HiveContext's functionality is a strict superset of
SQLContext's.
On Tue, Sep 2, 2014 at 6:35 PM, gtinside <gt...@gmail.com> wrote:
> Thanks. I am not using HiveContext. I am loading data from Cassandra,
> converting it to JSON, and then querying it through SQLContext. Can I use
> HiveContext to query a jsonRDD?
>
> Gaurav
Re: flattening a list in spark sql
Posted by gtinside <gt...@gmail.com>.
Thanks. I am not using HiveContext. I am loading data from Cassandra,
converting it to JSON, and then querying it through SQLContext. Can I use
HiveContext to query a jsonRDD?
Gaurav
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/flattening-a-list-in-spark-sql-tp13300p13320.html
Re: flattening a list in spark sql
Posted by Michael Armbrust <mi...@databricks.com>.
Check out LATERAL VIEW explode:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
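
As a rough illustration of what LATERAL VIEW explode followed by a GROUP BY computes for the data in the original question, here is a minimal sketch in plain Python (not Spark). The `records` list mirrors the two sample rows from the question; the helper names `explode_destinations` and `users_per_destination` are made up for this example. It assumes each user lists a destination at most once, so counting exploded rows equals counting users. In HiveQL the query would presumably look something like SELECT dest, COUNT(*) FROM prefs LATERAL VIEW explode(Preferences.destinations) d AS dest GROUP BY dest, where the table name prefs and the aliases d and dest are assumptions.

```python
from collections import Counter

# Hypothetical records mirroring the sample data in the question.
records = [
    {"user": "test1", "Preferences": {"destinations": ["Paris", "NYC", "LA", "EWR"]}},
    {"user": "test2", "Preferences": {"destinations": ["Paris", "SFO"]}},
]


def explode_destinations(rows):
    """Yield one (user, destination) pair per array element, like explode()."""
    for row in rows:
        prefs = row.get("Preferences") or {}
        # A missing, null, or empty array contributes no output rows.
        for dest in prefs.get("destinations") or []:
            yield row["user"], dest


def users_per_destination(rows):
    """Count exploded rows per destination (the GROUP BY step)."""
    return Counter(dest for _user, dest in explode_destinations(rows))


counts = users_per_destination(records)
for dest, n in sorted(counts.items()):
    print(f"Number of users:{n}, Destination:{dest}")
```

If a user could list the same destination more than once, the count would need to be taken over distinct (user, destination) pairs instead.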