Posted to user@spark.apache.org by Zeming Yu <ze...@gmail.com> on 2017/05/27 10:18:03 UTC

examples for flattening dataframes using pyspark

Hi,

I need to flatten a nested dataframe and I'm following this example:
https://docs.databricks.com/spark/latest/spark-sql/complex-types.html

Just wondering:
1. how can I test for the existence of an item before retrieving it?
Say, test whether "b" exists before adding it to my flat dataframe

events = jsonToDataFrame("""{
  "a": {
    "b": 1,
    "c": 2
  }
}""")


2.



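For question 1, here is a minimal sketch of one way to do it: check the DataFrame's schema for the nested field before selecting it. It assumes a SparkSession named spark, rebuilds the example DataFrame with spark.read.json rather than the jsonToDataFrame helper from the Databricks page, and has_nested_field is a hypothetical helper, not a built-in:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType

spark = SparkSession.builder.getOrCreate()

# Rebuild the example DataFrame directly from the JSON string
# (a stand-in for the jsonToDataFrame helper in the linked docs).
events = spark.read.json(
    spark.sparkContext.parallelize(['{"a": {"b": 1, "c": 2}}']))

def has_nested_field(schema, path):
    """Return True if a dotted path such as "a.b" exists in the schema."""
    current = schema
    for part in path.split("."):
        if not isinstance(current, StructType) or part not in current.fieldNames():
            return False
        current = current[part].dataType
    return True

# Only pull "b" into the flat DataFrame if it is actually there.
if has_nested_field(events.schema, "a.b"):
    flat = events.select("a.b")
    flat.show()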
Does anyone know of any examples of how to do these tasks?

Thanks!
Zeming

Re: examples for flattening dataframes using pyspark

Posted by Zeming Yu <ze...@gmail.com>.
Sorry, sent the incomplete email by mistake. Here's the full email:


> Hi,
>
> I need to flatten a nested dataframe and I'm following this example:
> https://docs.databricks.com/spark/latest/spark-sql/complex-types.html
>
> Just wondering:
> 1. how can I test for the existence of an item before retrieving it?
> Say, test whether "b" exists before adding it to my flat dataframe
>
> events = jsonToDataFrame("""{
>   "a": {
>     "b": 1,
>     "c": 2
>   }
> }""")
>
>
> 2. how can I loop through "b" and "c" and do some aggregation (e.g.
> finding the maximum, minimum)?
>
>
>
> Does anyone know of any examples of how to do these tasks?
>
> Thanks!
> Zeming
>
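And for question 2, a sketch of one reading of the request: loop over the fields inside the nested struct "a" and build one min and one max aggregation per field. This again assumes a SparkSession named spark and a small hand-built events DataFrame with two rows purely for illustration:

import pyspark.sql.functions as F
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Two rows shaped like the example, just for illustration.
events = spark.read.json(spark.sparkContext.parallelize([
    '{"a": {"b": 1, "c": 2}}',
    '{"a": {"b": 5, "c": 0}}',
]))

# Read the field names ("b" and "c") from the schema instead of
# hard-coding them, then build min/max expressions for each one.
aggs = []
for name in events.schema["a"].dataType.fieldNames():
    col = F.col("a." + name)
    aggs.append(F.min(col).alias("min_" + name))
    aggs.append(F.max(col).alias("max_" + name))

events.agg(*aggs).show()
# For the two rows above: min_b=1, max_b=5, min_c=0, max_c=2

If the question instead means a per-row maximum/minimum across "b" and "c", F.greatest and F.least over the same list of columns would do that.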