Posted to user@spark.apache.org by Arun Patel <ar...@gmail.com> on 2016/09/12 21:28:55 UTC

Check if a nested column exists in DataFrame

I'm trying to analyze XML documents using the spark-xml package. Since all XML
columns are optional, some columns may or may not exist. When I register
the DataFrame as a table, how do I check whether a nested column exists?
My column is named "emp", which is already exploded, and I am trying to
check whether the nested column "emp.mgr.col" exists. If it exists, I
need to use it. If it does not exist, I should set it to null. Is there a
way to achieve this?

Please note that I am not able to use the .columns attribute because it only
lists the top-level columns, not the nested ones.

Also note that I cannot manually specify the schema because of my
requirements.

I'm trying this in PySpark.

Thank you.
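
For illustration, here is a minimal sketch of one possible way to test for a
nested field without Spark-specific imports. It assumes the schema is taken
as a plain dict from df.schema.jsonValue(); the helper name has_nested_field
and the example schema are hypothetical, written by hand to resemble the
"emp.mgr.col" case described above.

```python
def has_nested_field(schema_json, path):
    """Return True if the dotted `path` resolves to a field in the schema dict."""
    node = schema_json
    for name in path.split("."):
        # Look through array wrappers to the element type.
        while isinstance(node, dict) and node.get("type") == "array":
            node = node["elementType"]
        # Only struct types have named child fields.
        if not (isinstance(node, dict) and node.get("type") == "struct"):
            return False
        match = next((f for f in node["fields"] if f["name"] == name), None)
        if match is None:
            return False
        node = match["type"]
    return True


# Hand-written stand-in for df.schema.jsonValue() (hypothetical data).
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "emp", "nullable": True, "metadata": {}, "type": {
            "type": "struct",
            "fields": [
                {"name": "name", "type": "string", "nullable": True, "metadata": {}},
                {"name": "mgr", "nullable": True, "metadata": {}, "type": {
                    "type": "struct",
                    "fields": [
                        {"name": "col", "type": "string", "nullable": True, "metadata": {}},
                    ],
                }},
            ],
        }},
    ],
}

print(has_nested_field(example_schema, "emp.mgr.col"))
print(has_nested_field(example_schema, "emp.mgr.missing"))
```

With a real DataFrame, the result of has_nested_field(df.schema.jsonValue(),
"emp.mgr.col") could decide between selecting col("emp.mgr.col") and
lit(None).alias("col") from pyspark.sql.functions. The null literal may need
a .cast(...) to the expected type.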

Re: Check if a nested column exists in DataFrame

Posted by Arun Patel <ar...@gmail.com>.
Is there a way to check whether a nested column exists in the schema in PySpark?

http://stackoverflow.com/questions/37471346/automatically-and-elegantly-flatten-dataframe-in-spark-sql
shows how to get the list of nested columns in Scala. Can the same be
done in PySpark?

Please help.
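
As a rough sketch of what listing nested columns could look like from Python,
the recursive walk below collects dotted names for every leaf field. It is
written in the spirit of the recursive flattening idea in the linked question,
not a port of it. It assumes the schema dict comes from
df.schema.jsonValue(); the function name flatten_field_names and the sample
schema are hypothetical.

```python
def flatten_field_names(schema_json, prefix=""):
    """Return dotted names of all leaf fields in a struct schema dict."""
    names = []
    for field in schema_json["fields"]:
        full_name = prefix + field["name"]
        field_type = field["type"]
        if isinstance(field_type, dict) and field_type.get("type") == "struct":
            # Recurse into nested structs, extending the dotted prefix.
            names.extend(flatten_field_names(field_type, full_name + "."))
        else:
            # Primitive types (and arrays/maps, which are not expanded here).
            names.append(full_name)
    return names


# Hand-written stand-in for df.schema.jsonValue() (hypothetical data).
nested_schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": True, "metadata": {}},
        {"name": "emp", "nullable": True, "metadata": {}, "type": {
            "type": "struct",
            "fields": [
                {"name": "name", "type": "string", "nullable": True, "metadata": {}},
                {"name": "mgr", "nullable": True, "metadata": {}, "type": {
                    "type": "struct",
                    "fields": [
                        {"name": "col", "type": "string", "nullable": True, "metadata": {}},
                    ],
                }},
            ],
        }},
    ],
}

print(flatten_field_names(nested_schema))
```

A membership test such as "emp.mgr.col" in flatten_field_names(...) would then
answer the original question for struct-only nesting. Fields inside arrays are
reported by the array's own name only in this sketch.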
