You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/23 15:31:40 UTC

[GitHub] [arrow] pitrou opened a new pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

pitrou opened a new pull request #10581:
URL: https://github.com/apache/arrow/pull/10581


   Generate a signature for compute functions that better reflects the accepted arguments.
   
   Example before:
   ```python
   >>> pc.sum?
   Signature: pc.sum(array, *, options=None, memory_pool=None, **kwargs)
   Docstring:
   Compute the sum of a numeric array.
   [...]
   ```
   
   Same example after:
   ```python
   >>> ?pc.sum
   Signature:
   pc.sum(
       array,
       *,
       memory_pool=None,
       options=None,
       skip_nulls=True,
       min_count=1,
   )
   Docstring:
   Compute the sum of a numeric array.
   [...]
   ```
   
   One caveat is that the individual options are not explicitly documented (yet):
   ```
   Parameters
   ----------
   array : Array-like
       Argument to compute function
   memory_pool : pyarrow.MemoryPool, optional
       If not passed, will allocate memory from the default memory pool.
   options : pyarrow.compute.ScalarAggregateOptions, optional
       Parameters altering compute function semantics
   **kwargs : optional
       Parameters for ScalarAggregateOptions constructor. Either `options`
       or `**kwargs` can be passed, but not both at the same time.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r661765127



##########
File path: python/pyarrow/compute.py
##########
@@ -173,22 +176,37 @@ def _handle_options(name, option_class, options, kwargs):
     return options
 
 
-_wrapper_template = dedent("""\
-    def make_wrapper(func, option_class):
-        def {func_name}({args_sig}{kwonly}, memory_pool=None):
-            return func.call([{args_sig}], None, memory_pool)
-        return {func_name}
-    """)
-
-_wrapper_options_template = dedent("""\
-    def make_wrapper(func, option_class):
-        def {func_name}({args_sig}{kwonly}, options=None, memory_pool=None,
-                        **kwargs):
-            options = _handle_options({func_name!r}, option_class, options,
+def _make_generic_wrapper(func_name, func, option_class):
+    if option_class is None:
+        def wrapper(*args, memory_pool=None):
+            return func.call(args, None, memory_pool)
+    else:
+        def wrapper(*args, memory_pool=None, options=None, **kwargs):
+            options = _handle_options(func_name, option_class, options,
                                       kwargs)
-            return func.call([{args_sig}], options, memory_pool)
-        return {func_name}
-    """)
+            return func.call(args, options, memory_pool)
+    return wrapper
+
+
+def _make_dummy_equivalent(args_sig, kwonly, option_class):
+    # Make a dummy callable with the signature that we want to expose
+    # as the real function's signature.
+    str_sig = f"{args_sig}{kwonly}, memory_pool=None"
+    if option_class is not None:
+        str_sig += ", options=None"
+        sig = inspect.signature(option_class)
+        for p in sig.parameters.values():
+            p_default = p.default if p.default is not p.empty else None
+            try:
+                # If the default value cannot be represented accurately,
+                # use None instead

Review comment:
       Is there an example of this? If possible it seems better to avoid an inaccurate signature, could we just `assert eval(repr(p_default)) == p_default`? Otherwise please add a test showing the overwritten default




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662216060



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       The other benefit here is to confine the scope of `eval()`: now instead of using it to create the whole wrapper function we're only using it to get the signature right




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662367890



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       Ok, it worked with `__signature__`. Thanks for the suggestion!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#issuecomment-872356997


   Will merge now. Thank you for the reviews!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#issuecomment-871591499


   Ping @amol- 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10581:
URL: https://github.com/apache/arrow/pull/10581


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662242293



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       `__text_signature__` is not really specified AFAICT, it's mostly for internal use by Argument Clinic (though other C generators may use it). We can try to set `__signature__`, however.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662174593



##########
File path: python/pyarrow/compute.py
##########
@@ -173,22 +176,37 @@ def _handle_options(name, option_class, options, kwargs):
     return options
 
 
-_wrapper_template = dedent("""\
-    def make_wrapper(func, option_class):
-        def {func_name}({args_sig}{kwonly}, memory_pool=None):
-            return func.call([{args_sig}], None, memory_pool)
-        return {func_name}
-    """)
-
-_wrapper_options_template = dedent("""\
-    def make_wrapper(func, option_class):
-        def {func_name}({args_sig}{kwonly}, options=None, memory_pool=None,
-                        **kwargs):
-            options = _handle_options({func_name!r}, option_class, options,
+def _make_generic_wrapper(func_name, func, option_class):
+    if option_class is None:
+        def wrapper(*args, memory_pool=None):
+            return func.call(args, None, memory_pool)
+    else:
+        def wrapper(*args, memory_pool=None, options=None, **kwargs):
+            options = _handle_options(func_name, option_class, options,
                                       kwargs)
-            return func.call([{args_sig}], options, memory_pool)
-        return {func_name}
-    """)
+            return func.call(args, options, memory_pool)
+    return wrapper
+
+
+def _make_dummy_equivalent(args_sig, kwonly, option_class):
+    # Make a dummy callable with the signature that we want to expose
+    # as the real function's signature.
+    str_sig = f"{args_sig}{kwonly}, memory_pool=None"
+    if option_class is not None:
+        str_sig += ", options=None"
+        sig = inspect.signature(option_class)
+        for p in sig.parameters.values():
+            p_default = p.default if p.default is not p.empty else None
+            try:
+                # If the default value cannot be represented accurately,
+                # use None instead

Review comment:
       Hmm, I'm not sure there's an example of this, no. We can probably simply assert for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#issuecomment-866940079


   @amol- Do you want to review this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662174383



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       > The only reason I could think of is that in reality the signature you want to expose is not the real one you want to use.
   
   This is exactly the case here: the function definition mostly forwards `*args` and `**kwargs` to the function kernel and the function options constructor, respectively. See the `wrapper` definitions above.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#issuecomment-866940044


   https://issues.apache.org/jira/browse/ARROW-10316


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] amol- commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662143862



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       While I do see how it can work, I'm not sure I got the benefit of building a dummy function with the right signature wrapped in the real function with the wrong signature.
   
   I mean, if you are building the function with the right signature in `make_dummy_equivalent`, while not make that one not dummy, have it invoke the real compute function and directly return it?
   
   The only reason I could think of is that in reality the signature you want to expose is not the real one you want to use.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] amol- commented on a change in pull request #10581: ARROW-10316: [Python] Improve introspection of compute function options

Posted by GitBox <gi...@apache.org>.
amol- commented on a change in pull request #10581:
URL: https://github.com/apache/arrow/pull/10581#discussion_r662228994



##########
File path: python/pyarrow/compute.py
##########
@@ -197,16 +215,13 @@ def _wrap_function(name, func):
     args_sig = ', '.join(arg_names)
     kwonly = '' if arg_names[-1].startswith('*') else ', *'
 
-    # Generate templated wrapper, so that the signature matches
-    # the documented argument names.
-    ns = {}
-    if option_class is not None:
-        template = _wrapper_options_template
-    else:
-        template = _wrapper_template
-    exec(template.format(func_name=name, args_sig=args_sig, kwonly=kwonly),
-         globals(), ns)
-    wrapper = ns['make_wrapper'](func, option_class)
+    # We make a generic wrapper with a simple argument-handling
+    # machinery, but we also add a __wrapped__ attribute pointing
+    # to a dummy callable with the desired signature for introspection
+    # and documentation.
+    wrapper = _make_generic_wrapper(name, func, option_class)
+    wrapper.__wrapped__ = _make_dummy_equivalent(args_sig, kwonly,
+                                                 option_class)

Review comment:
       Uhm, if our only goal is to fake the signature, why not just use ``__signature__`` or ``__text_signature__`` which are available to expose signatures of C functions?
   
   Like
   
   ```
   def create_function(f_name, args):
       real_func = getattr(__builtins__, f_name)
       def f_impl(*args, **kwargs):
           return real_func(*args)
       f_impl.__name__ = f_name
       f_impl.__text_signature__ = "(a, b)"
       return f_impl
   
   import inspect
   f = create_function("min", ("a", "b"))
   print(f.__name__, inspect.signature(f), f(1, 2))
   ```
   
   That seem to achieve the same goal without any compile or ``__wrapped__`` magic




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org