Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/08/30 06:54:19 UTC

[GitHub] [spark] PavithraRamachandran opened a new pull request #25628: [SPARK-28897][Core]'coalesce' error when executing dataframe.na.fill

URL: https://github.com/apache/spark/pull/25628
 
 
   ### What changes were proposed in this pull request?
   **Root Cause:**
   When a DataFrame is created from a select statement with **spark.sql.parser.quotedRegexColumnNames=true** and DataFrame fill is then called, _fillCol_ in DataFrameNaFunctions explicitly wraps each of the **columnNames** in backticks (`` ` ``). With that setting enabled, the backtick-quoted column name is mistaken for a regex and is set as an UnresolvedRegex, which causes resolution of the coalesce expression to fail.
   
   _Observation_
   When the DataFrame is created from a select statement that uses a regex, the filter (regex) has already been applied, so valid column names are returned. Adding _backticks_ to the column names in this flow was therefore not needed. To check the impact, select statements with regexes were run without the _backticks_, and there was no impact.
   
   **After Fix**
   When the column name is passed to the DataFrame column method, backticks (`` ` ``) are no longer added, because the value received is a valid column name, not a regular expression.
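
The root cause and the fix described above can be modelled in a few lines of plain Python. This is only a toy sketch with no Spark dependency, not the actual Spark code: the `QUOTED` pattern, the `col` and `fill_col` helpers, and the `add_backticks` flag are hypothetical names introduced for illustration, and the error is simulated with a `ValueError`.

```python
import re

# Hypothetical stand-in for the pattern that recognises a backtick-quoted name.
QUOTED = re.compile(r"`(.+)`")


def col(name, quoted_regex_enabled):
    """Toy model of the column lookup: returns a (kind, value) pair."""
    m = QUOTED.fullmatch(name)
    if quoted_regex_enabled and m:
        # With the setting on, a quoted name is taken to be a regex.
        return ("unresolved_regex", m.group(1))
    # Otherwise the name (minus any backticks) is a plain column reference.
    return ("column", m.group(1) if m else name)


def fill_col(name, replacement, quoted_regex_enabled, add_backticks):
    """Toy model of fillCol: builds coalesce(col, replacement) or fails."""
    lookup = f"`{name}`" if add_backticks else name
    kind, value = col(lookup, quoted_regex_enabled)
    if kind != "column":
        # Simulates the resolution failure described in the root cause.
        raise ValueError(f"cannot resolve coalesce over {kind} '{value}'")
    return f"coalesce({value}, {replacement!r})"


# After the fix: no backticks are added, so the lookup yields a plain column.
print(fill_col("age", 0, quoted_regex_enabled=True, add_backticks=False))
```

In this model, calling `fill_col("age", 0, quoted_regex_enabled=True, add_backticks=True)` raises the simulated error, which corresponds to the behaviour before the fix.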
   
   ### Why are the changes needed?
   With this change, the column name is no longer treated as a regex, so the proper Column function is derived and the expression no longer fails to resolve.
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   The patch was tested by adding UT cases and by running various select statements (with and without regex) in the Spark shell.
   
   Before Fix:
   ![Before](https://user-images.githubusercontent.com/51401130/63996784-417fe600-cb1a-11e9-9c0c-f15a0e9d362c.png)
   
   
   After Fix:
   ![After](https://user-images.githubusercontent.com/51401130/63996792-4e043e80-cb1a-11e9-8ddf-753f9e1444f8.png)
   
   
