Posted to user@spark.apache.org by Zeming Yu <ze...@gmail.com> on 2017/05/27 10:18:03 UTC
examples for flattening dataframes using pyspark
Hi,
I need to flatten a nested dataframe and I'm following this example:
https://docs.databricks.com/spark/latest/spark-sql/complex-types.html
Just wondering:
1. how can I test for the existence of an item before retrieving it
Say test if "b" exists before adding that into my flat dataframe
events = jsonToDataFrame("""{
  "a": { "b": 1, "c": 2 }
}""")
2.
Does anyone know of any examples of how to do these tasks?
Thanks!
Zeming
Re: examples for flattening dataframes using pyspark
Posted by Zeming Yu <ze...@gmail.com>.
Sorry, sent the incomplete email by mistake. Here's the full email:
> Hi,
>
> I need to flatten a nested dataframe and I'm following this example:
> https://docs.databricks.com/spark/latest/spark-sql/complex-types.html
>
> Just wondering:
> 1. how can I test for the existence of an item before retrieving it
> Say test if "b" exists before adding that into my flat dataframe
>
> events = jsonToDataFrame("""{
>   "a": { "b": 1, "c": 2 }
> }""")
>
>
> 2. how can I loop through "b" and "c" and do some aggregation (e.g.
> finding the maximum, minimum)?
>
>
>
> Does anyone know of any examples of how to do these tasks?
>
> Thanks!
> Zeming
>
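One possible approach to both questions, sketched below. This is not from the thread itself: `jsonToDataFrame` is the helper defined on the linked Databricks page, and the usage lines involving a live DataFrame are assumptions, left as comments. For (1), the schema can be inspected before selecting the nested column; for (2), the nested field names can be enumerated from the schema and turned into aggregate expressions. The helpers here operate on the plain-dict form of the schema (`df.schema.jsonValue()`), so they need no running Spark session.

```python
# Sketch, assuming Spark 2.x and the jsonToDataFrame helper from the
# Databricks complex-types page. The two helpers below work on
# df.schema.jsonValue(), i.e. a plain dict, not on a DataFrame.

def nested_field_names(schema_dict, parent):
    """Names of the fields nested under struct column `parent`,
    e.g. ['b', 'c'] for column 'a' in {"a": {"b": 1, "c": 2}}."""
    for field in schema_dict["fields"]:
        # For struct columns, field["type"] is a dict with its own
        # "fields" list; for atomic columns it is a string like "long".
        if field["name"] == parent and isinstance(field["type"], dict):
            return [f["name"] for f in field["type"]["fields"]]
    return []

def has_nested_field(schema_dict, parent, child):
    """Question 1: does `parent.child` exist in the schema?"""
    return child in nested_field_names(schema_dict, parent)

# Hypothetical usage with a live DataFrame (untested assumption):
#
# events = jsonToDataFrame('{"a": {"b": 1, "c": 2}}')
# schema = events.schema.jsonValue()
# if has_nested_field(schema, "a", "b"):
#     flat = events.select("a.b")
#
# Question 2: build min/max aggregates over every field under "a":
#
# from pyspark.sql import functions as F
# aggs = [agg(F.col("a." + name)).alias(label + "_" + name)
#         for name in nested_field_names(schema, "a")
#         for agg, label in ((F.min, "min"), (F.max, "max"))]
# events.agg(*aggs).show()
```

The schema-dict detour is just to keep the helpers testable offline; with a SparkSession available, the same information is reachable through `events.schema` directly.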