Posted to commits@spark.apache.org by gu...@apache.org on 2022/06/18 00:31:08 UTC

[spark] branch master updated: [SPARK-39456][DOCS][PYTHON] Fix broken function links in the auto-generated pandas API support list documentation

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e125c4807f9 [SPARK-39456][DOCS][PYTHON] Fix broken function links in the auto-generated pandas API support list documentation
e125c4807f9 is described below

commit e125c4807f9dd9f8c0f9af873bf129d5b756027a
Author: beobest2 <cl...@naver.com>
AuthorDate: Sat Jun 18 09:30:55 2022 +0900

    [SPARK-39456][DOCS][PYTHON] Fix broken function links in the auto-generated pandas API support list documentation
    
    ### What changes were proposed in this pull request?
    
    In the auto-generated "Supported pandas APIs" documentation, some of the function links provided in the document are broken and need to be corrected.
    
    The current supported-API generation code dynamically compares the modules of `pyspark.pandas` and `pandas` to find the differences.
    In doing so, it also collects methods inherited from parent classes (for example, `CategoricalIndex.all()` is inherited from `Index.all()`), and the links for those methods are not generated correctly because they do not match the pattern of each API document.
    
    This change therefore excludes methods that are only inherited from a parent class when the list is generated.
    
    ### Why are the changes needed?
    
    To link to the correct API document.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, the "Supported pandas APIs" page has changed as below.
    <img width="992" alt="Screen Shot 2022-06-16 at 7 54 05 PM" src="https://user-images.githubusercontent.com/7010554/174196507-ea8a2577-1e2c-44cd-9564-e7dd81f5c799.png">
    
    ### How was this patch tested?
    
    Manually checked the links in the generated documents; the existing doc build should also pass.
    
    Closes #36895 from beobest2/SPARK-39456.
    
    Authored-by: beobest2 <cl...@naver.com>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 python/pyspark/pandas/supported_api_gen.py | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/pandas/supported_api_gen.py b/python/pyspark/pandas/supported_api_gen.py
index 392b5408020..019ed0ce254 100644
--- a/python/pyspark/pandas/supported_api_gen.py
+++ b/python/pyspark/pandas/supported_api_gen.py
@@ -135,11 +135,23 @@ def _create_supported_by_module(
         # module not implemented
         return {}
 
-    pd_funcs = dict([m for m in getmembers(pd_module, isfunction) if not m[0].startswith("_")])
+    pd_funcs = dict(
+        [
+            m
+            for m in getmembers(pd_module, isfunction)
+            if not m[0].startswith("_") and m[0] in pd_module.__dict__
+        ]
+    )
     if not pd_funcs:
         return {}
 
-    ps_funcs = dict([m for m in getmembers(ps_module, isfunction) if not m[0].startswith("_")])
+    ps_funcs = dict(
+        [
+            m
+            for m in getmembers(ps_module, isfunction)
+            if not m[0].startswith("_") and m[0] in ps_module.__dict__
+        ]
+    )
 
     return _organize_by_implementation_status(
         module_name, pd_funcs, ps_funcs, pd_module_group, ps_module_group
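
The effect of the added `m[0] in pd_module.__dict__` condition can be illustrated with a minimal, self-contained sketch. The classes `Base` and `Child` and the helper `public_functions` are hypothetical stand-ins (loosely mirroring `Index` and `CategoricalIndex`); they are not part of `supported_api_gen.py`. The sketch assumes the object being inspected is a class, in which case `getmembers` also returns inherited functions, while the class's own `__dict__` holds only names defined directly on it.

```python
from inspect import getmembers, isfunction


class Base:
    """Hypothetical parent class (stand-in for something like Index)."""

    def all(self):
        return True

    def _private(self):
        return None


class Child(Base):
    """Hypothetical subclass (stand-in for something like CategoricalIndex)."""

    def codes(self):
        return []


def public_functions(cls, own_only):
    """Collect public function members of cls.

    With own_only=True, keep only names defined directly on cls,
    which is the extra condition added by the patch above.
    """
    return dict(
        [
            m
            for m in getmembers(cls, isfunction)
            if not m[0].startswith("_") and (not own_only or m[0] in cls.__dict__)
        ]
    )


before = public_functions(Child, own_only=False)  # old behaviour: inherited names included
after = public_functions(Child, own_only=True)    # new behaviour: inherited names dropped

print(sorted(before))  # ['all', 'codes']
print(sorted(after))   # ['codes']
```

Under these assumptions, the inherited `all` no longer appears under `Child`, so no link is generated for a `Child.all` page that may not exist. A method that the subclass overrides would still be kept, since it is present in the subclass's own `__dict__`.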

